Preface to the 
2nd edition 


Readers who are familiar with the first edition of Understanding Morphology 
(of which Martin Haspelmath was sole author) will find that the book's 
fundamental character has not changed. This book provides an introduction 
to linguistic morphology, with a focus on demonstrating the diversity of 
morphological patterns in human language and elucidating broad issues 
that are the foundation upon which morphological theories are built. 

At the same time, the material in this book has been substantially 
restructured and some topics have been expanded. The goal was to bring 
foundational issues to the forefront. This was accomplished mostly by 
expanding existing chapter sections or creating new chapter sections to 
centralize and focus discussion that was previously spread throughout a 
chapter. In some cases, however, the restructuring has been more radical. 
Notably, Chapter 3 from the first edition (‘Lexicon and Rules’) has been 
divided into two chapters, with more attention given to the question of 
whether the lexicon is fundamentally morpheme-based or word-based. 
Also, the chapter ‘Word-based Rules’ (formerly Chapter 9) has been 
eliminated, with its material redistributed elsewhere, as relevant. 

There are also some new and expanded features: answers to each 
chapter's comprehension exercises can now be found at the back of the 
book; the glossary has been significantly enlarged; Chapter 5 has a new 
appendix on notation conventions for inflectional values; and perhaps most 
notably, nine chapters now contain exploratory exercises. The exploratory 
exercises are larger in scope than the comprehension exercises and extend 
the themes of the chapters. They guide readers through research questions 
in an open-ended way, asking them to gather and analyze data from a 
variety of sources, such as descriptive grammars, corpora, and native 
speaker consultants. The exercises are broadly constructed so that they can 
be tailored to the needs and interests of particular individuals or groups. 
In a classroom setting, instructors can use them with different levels of 
students by adjusting their expectations regarding depth of analysis and 
methodological rigor. 

A number of people have helped to improve this new edition. First 
and foremost, we thank the series editors, Bernard Comrie and Greville 
Corbett, whose numerous suggestions and dedication to the project have 
greatly improved it. (Naturally, all errors remain the fault of the authors.) 
At Hodder we are also grateful to Bianca Knights and Tamsin Smith for 
their encouragement and deep well-springs of patience, and to Liz Wilson, 
for shepherding the project through production. 

In the end, textbooks are for students, and we would also like to thank 
Andrea Sims's morphology students at Northwestern University and The 
Ohio State University for their feedback. We especially thank Christine 
Davis, Caitlin Ferrarell, Laura Garofalo, Alexander Obal, Zach Richards, 
Cenia Rodriguez, and Honglei Wang. They provided extensive, detailed, 
valuable, and sometimes unexpected perspectives on the first edition. Their 
critique of some aspects of the second edition (particularly, drafts of the 
exploratory exercises) also proved crucial. 

This second edition contains some new examples, and we thank the 
following people for their help in understanding the relevant languages 
and providing appropriate examples: Hope Dawson (Sanskrit), Maggie 
Gruszczynska (Polish), Jessie Labov (Hungarian), and Amanda Walling 
(Old English). Any errors remain the fault of the authors. 

We are indebted to the various scholars and teachers who wrote reviews 
of the first edition, or who have passed on their experiences in teaching with 
the book. We are happy that the book was, on the whole, warmly received. 
We have tried to improve that which was deemed in need of improvement. 

Finally, we thank our families, and especially our partners, Susanne 
Michaelis and Jason Packer, for all manner of help and support. 


Leipzig, Germany 
Columbus, Ohio, USA 
July 2010 


Preface to the 
1st edition 


This book provides an introduction to the field of linguistic morphology. It 
gives an overview of the basic notions and the most important theoretical 
issues, emphasizing throughout the diversity of morphological patterns in 
human languages. Readers who are primarily interested in understanding 
English morphology should not be deterred by this, however, because 
an individual language can be understood in much greater depth when 
viewed against the cross-linguistic background. 

The focus of this book is on morphological phenomena and on broad 
issues that have occupied morphologists of various persuasions for a long 
time. No attempt is made to trace the history of linguists' thinking about 
these issues, and references to the theoretical literature are mostly confined to 
the ‘Further reading’ sections. I have not adopted any particular theoretical 
framework, although I did have to opt for one particular descriptive format 
for morphological rules (see Section 3.2.2). Readers should be warned that 
this format is no more ‘standard’ than any other format, and not particularly 
widespread either. But I have found it useful, and the advanced student 
will soon realize how it can be translated into other formats. 

Although it is often said that beginning students are likely to be confused 
by the presentation of alternative views in textbooks, this book does 
not pretend that there is one single coherent and authoritative view of 
morphology. Debates and opposing viewpoints are so much part of science 
that omitting them completely from a textbook would convey a wrong 
impression of what linguistic research is like. And I did not intend to remain 
neutral in these debates, not only because it would have been virtually 
impossible anyway, but also because a text that argues for a particular view 
is invariably more interesting than one that just presents alternative views. 

A number of people have helped me in writing this book. My greatest 
thanks go to the series editors, Bernard Comrie and Greville Corbett, who 
provided countless suggestions for improving the book. 

I also thank Renate Raffelsiefen for her expert advice on phonological 
questions, as well as Tomasz Bak and Agnieszka Reid for help with Polish 
examples, and Claudia Schmidt for help with the indexes. 

Finally, I thank Susanne Michaelis for all kinds of help, both in very 
specific and in very general ways. This book is dedicated to our son, Gabriel. 


Martin Haspelmath 
Leipzig 
December 2001 


Abbreviations 


These abbreviations are consistent with the Leipzig Glossing Rules 
(v. February 2008). 


ABE          abessive
ABL          ablative
ABS          absolutive
ACC          accusative
ACT          active
ADJ          adjective
ADV          adverb(ial)
AFF          affirmative
AG           agent
AGR          agreement
ALL          allative
ANTIC        anticausative
ANTIP        antipassive
AOR          aorist
APPL         applicative
ART          article
ASP          aspect
AUX          auxiliary
CAUS         causative
CLF          classifier
COMP         complementizer
COMPL        completive
COND         conditional
CONT         continuative
CVB          converb
DAT          dative
DECL         declarative
DEF          definite
DEM          demonstrative
DEOBJ        deobjective
DESID        desiderative
DET          determiner
DO           direct object
DU           dual
DUR          durative
ELA          elative
ERG          ergative
ESS          essive
EXCL         exclusive
F            feminine
FOC          focus
FUT          future
G            gender (e.g. G1 = gender 1)
GEN          genitive
HAB          habitual
HYP          hypothetical
IMP          imperative
IMPF         imperfect(ive)
IMPV         imperative
INCL         inclusive
IND          indicative
INDF         indefinite
INESS        inessive
INF          infinitive




INS          instrumental
INTF         interfix
INTR/intr.   intransitive
IOBJ         indirect object
LOC          locative
masculine 
masdar 

noun 

neuter 
necessitative 
negation, negative 
nominative 
noun phrase 
object 

oblique 
Oxford English 
Dictionary 
optative 
patient 
partitive 
participle 
passive 
perfective 
plural 
possessive 
potential 


PP           prepositional phrase
PRED         predicate
PREF         prefix
PRET         preterite
PRF          perfect
PRS          present
PRIV         privative
PROG         progressive
PROPR        proprietive
PST          past
PTCP         participle
PURP         purposive
RECP         reciprocal
REFL         reflexive
REL          relative clause marker
REP          repetitive
SBJ          subject
SBJV         subjunctive
SG           singular
SS           same-subject
SUBORD       subordinator
SUF          suffix
TOP          topic
TR/tr.       transitive
V            verb
VP           verb phrase


Introduction 


1.1 What is morphology? 


Morphology is the study of the internal structure of words.¹ Somewhat 
paradoxically, morphology is both the oldest and one of the youngest 
subdisciplines of grammar. It is the oldest because, as far as we know, the 
first linguists were primarily morphologists. The earliest extant grammatical 
texts are well-structured lists of morphological forms of Sumerian words, 
some of which are shown in (1.1). They are attested on clay tablets from 
Ancient Mesopotamia and date from around 1600 BCE. 


(1.1) badu       ‘he goes away’            ingen       ‘he went’
      baduun     ‘I go away’               ingenen     ‘I went’
      bašidu     ‘he goes away to him’     inšigen     ‘he went to him’
      bašiduun   ‘I go away to him’        inšigenen   ‘I went to him’


(Jacobsen 1974: 53-4) 


Sumerian was the traditional literary language of Mesopotamia but, by the 
second millennium BCE, it was no longer spoken as a medium of everyday 
communication (having been replaced by the Semitic language Akkadian), 
so it needed to be recorded in grammatical texts. Morphology was also 
prominent in the writings of the greatest grammarian of Antiquity, the 
Indian Pāṇini (fifth century BCE), and in the Greek and Roman grammatical 
tradition. Until the nineteenth century, Western linguists often thought of 
grammar as consisting primarily of word structure, perhaps because the 
classical languages Greek and Latin had fairly rich morphological patterns 
that were difficult for speakers of the modern European languages. 

¹ The reader should be aware that the chapter's opening sentence, while seemingly 
straightforward, conceals a controversy: there is no agreed-upon definition of ‘word’. 
The relevant issues are addressed in Chapter 9, but here, and through most of the book, 
we will appeal to a loose, intuitive concept of ‘word’. 

This is also the reason why it was only in the second half of the nineteenth 
century that the term morphology was invented and became current. Earlier 
there was no need for a special term, because the term grammar mostly 
evoked word structure, i.e. morphology. The terms phonology (for sound 
structure) and syntax (for sentence structure) had existed for centuries 
when the term morphology was introduced. Thus, in this sense morphology 
is a young discipline. 

Our initial definition of morphology, as the study of the internal structure 
of words, needs some qualification, because words have internal structure in 
two very different senses. On the one hand, they are made up of sequences 
of sounds (or gestures in sign language), i.e. they have internal phonological 
structure. Thus, the English word nuts consists of the four sounds (or, as 
we will say, phonological segments) [nʌts]. In general, phonological segments 
such as [n] or [t] cannot be assigned a specific meaning - they have a purely 
contrastive value (so that, for instance, nuts can be distinguished from cuts, 
guts, shuts, from nets, notes, nights, and so on). 

But often formal variations in the shapes of words correlate systematically 
with semantic changes. For instance, the words nuts, nights, necks, backs, 
taps (and so on) share not only a phonological segment (the final [s]), but 
also a semantic component: they all refer to a multiplicity of entities from 
the same class. And, if the final [s] is lacking (nut, night, neck, back, tap), 
reference is made consistently to only one such entity. By contrast, the 
words blitz, box, lapse do not refer to a multiplicity of entities, and there are 
no semantically related words *blit, *bok, *lap.² We will call words like nuts 
'(morphologically) complex words'. 

In a morphological analysis, we would say that the final [s] of nuts 
expresses plural meaning when it occurs at the end of a noun. But the 
final [s] in lapse does not have any meaning, and lapse does not have 
morphological structure. Thus, morphological structure exists if there are 
groups of words that show identical partial resemblances in both form and 
meaning. Morphology can be defined as in Definition 1. 


Definition 1: 
Morphology is the study of systematic covariation in the form and 
meaning of words. 


It is important that this form-meaning covariation occurs systematically 
in groups of words. When there are just two words with partial form-meaning 
resemblances, these may be merely accidental. Thus, one would 
not say that the word hear is morphologically structured and related to 
ear. Conceivably, h could mean ‘use’, so h-ear would be ‘use one’s ear’, i.e. 
‘hear’. But this is the only pair of words of this kind (there is no *heye ‘use 
one’s eye’, *helbow ‘use one’s elbow’, etc.), and everyone agrees that the 
resemblances are accidental in this case. 

² The asterisk symbol (*) is used to mark nonexistent or impossible expressions. 
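
The requirement that form-meaning resemblances recur across groups of words can be illustrated with a small Python sketch. It checks only the form side of the covariation, over a toy word list taken from the examples above. The function name `supporting_pairs` and the use of ordinary spelling in place of phonological segments are assumptions made for illustration; whether the meanings covary as well still has to be judged by the analyst.

```python
# Toy word list drawn from the examples in this section (ordinary spelling,
# used here as a stand-in for phonological forms).
WORDS = {"nut", "nuts", "night", "nights", "neck", "necks",
         "back", "backs", "tap", "taps", "ear", "hear", "eye", "elbow"}


def supporting_pairs(words, prefix="", suffix=""):
    """Return the sorted bases whose affixed form is also in the word list."""
    return sorted(w for w in words if prefix + w + suffix in words)


# Several pairs support a final -s: a recurrent pattern worth analysing.
print(supporting_pairs(WORDS, suffix="s"))

# A single pair supports an initial h-: most likely an accident.
print(supporting_pairs(WORDS, prefix="h"))
```

A pattern backed by many pairs is a candidate for morphological analysis, while a pattern backed by a single pair, such as ear/hear, is not.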

Morphological analysis typically consists of the identification of parts 
of words, or, more technically, constituents of words. We can say that the 
word nuts consists of two constituents: the element nut and the element 
s. In accordance with a widespread typographical convention, we will 
often separate word constituents by a hyphen: nut-s. It is often suggested 
that morphological analysis primarily consists in breaking up words into 
their parts and establishing the rules that govern the co-occurrence of 
these parts. The smallest meaningful constituents of words that can be 
identified are called morphemes. In nut-s, both -s and nut are morphemes. 
Other examples of words consisting of two morphemes would be break- 
ing, hope-less, re-write, cheese-board; words consisting of three morphemes 
are re-writ-ing, hope-less-ness, ear-plug-s; and so on. Thus, morphology could 
alternatively be defined as in Definition 2. 


Definition 2: 
Morphology is the study of the combination of morphemes to yield 
words. 


This definition looks simpler and more concrete than Definition 1. It would 
make morphology quite similar to syntax, which is usually defined as ‘the 
study of the combination of words to yield sentences’. However, we will 
see later that Definition 2 does not work in all cases, so we should stick to 
the somewhat more abstract Definition 1 (see especially Chapters 3 and 4). 

In addition to its main sense, where morphology refers to a subdiscipline 
of linguistics, it is also often used in a closely related sense, to denote a 
part of the language system. Thus, we can speak of ‘the morphology of 
Spanish’ (meaning Spanish word structures) or of ‘morphology in the 1980s’ 
(meaning a subdiscipline of linguistics). The term morphology shares this 
ambiguity with other terms such as syntax, phonology and grammar, which 
may also refer either to a part of the language or to the study of that part 
of the language. This book is about morphology in both senses. We hope 
that it will help the reader to understand morphology both as a part of the 
language system and as a part of linguistics. 

One important limitation of the present book should be mentioned right 
at the beginning: it deals only with spoken languages. Sign languages of 
course have morphology as well, and the only justification for leaving 
them out of consideration here is the authors’ limited competence. As more 
and more research is done on sign languages, it can be expected that these 
studies will have a major impact on our views of morphology and language 
structure in general. 


1.2 Morphology in different languages 


Morphology is not equally prominent in all (spoken) languages. What one 
language expresses morphologically may be expressed by a separate word 
or left implicit in another language. For example, English expresses the 
plural of nouns by means of morphology (nut/nuts, night/nights, and so on), 
but Yoruba uses a separate word for expressing the same meaning. Thus, 
ọkùnrin means ‘(the) man’, and the word àwọn can be used to express the 
plural: àwọn ọkùnrin ‘the men’. But in many cases where several entities are 
referred to, this word is not used and plurality is simply left implicit. 

Quite generally, we can say that English makes more use of morphology than 
Yoruba. But there are many languages that make more use of morphology 
than English. For instance, as we saw in (1.1), Sumerian uses morphology to 
distinguish between ‘he went’ and ‘I went’, and between ‘he went’ and ‘he 
went to him’, where English must use separate words. In Classical Greek, there 
is a dual form for referring to two items, e.g. adelphṓ ‘two brothers’. In English 
it is possible to use the separate word ‘two’ to render this form, but it is also 
possible to simply use the plural form and leave the precise number of items 
implicit. 

Linguists sometimes use the terms analytic and synthetic to describe 
the degree to which morphology is made use of in a language. Languages 
like Yoruba, Vietnamese or English, where morphology plays a relatively 
modest role, are called analytic. Consider the following example sentences.³ 


(1.2) Yoruba 
Nwọn ó maa gbà pọ́nùn mẹ́wǎ lọ́sọ̀ọ̀sẹ̀. 
they FUT PROG get pound ten weekly 
‘They will be getting £10 a week.’ 
(Rowlands 1969: 93) 


(1.3) Vietnamese 
Hai đứa bỏ nhau là tại gia-đình thằng chồng. 
two individual leave each.other be because.of family guy husband 
‘They divorced because of his family.’ 
(Nguyen 1997: 223) 


³ For each example sentence from an unfamiliar language, not only an idiomatic translation 
is provided, but also a literal (‘morpheme-by-morpheme’) translation. The key for 
abbreviations is found on pp. xv-xvi, and further notational conventions are explained in 
the Appendix to Chapter 2. 
When a language has almost no morphology and thus exhibits an extreme 
degree of analyticity, it is also called isolating. Yoruba and Vietnamese, but 
not English, are usually qualified as isolating. Languages like Sumerian, 
Swahili or Lezgian, where morphology plays a more important role, would 
be called synthetic. Let us again look at two example sentences. 


(1.4) Swahili 
Ndovu wa-wili wa-ki-song-ana zi-umia-zo ni nyika. 
elephants PL-two 3PL-SUBORD-jostle-RECP 3SG-hurt-REL is grass 
‘When two elephants jostle, what is hurt is the grass.’ 
(Ashton 1947: 114) 


(1.5) Lezgian 
Marf-adi wiči-n qalin st’al-ra-ldi qaw gata-zwa-j. 
rain-ERG self-GEN dense drop-PL-INS roof hit-IMPF-PST 
‘The rain was hitting the roof with its dense drops.’ 
(Haspelmath 1993: 140) 


When a language has an extraordinary amount of morphology and 
perhaps many compound words, it is called polysynthetic. An example is 
West Greenlandic.⁴ 


(1.6) West Greenlandic 
Paasi-nngil-luinnar-para ilaa-juma-sutit. 
understand-not-completely-1SG.SBJ.3SG.OBJ.IND come-want-2SG.PTCP 
‘I didn't understand at all that you wanted to come along.’ 
(Fortescue 1984: 36) 


The distinction between analytic and (poly)synthetic languages is not 
a bipartition or a tripartition, but a continuum, ranging from the most 
radically isolating to the most highly polysynthetic languages. We can 
determine the position of a language on this continuum by computing its 
degree of synthesis, i.e. the ratio of morphemes per word in a random text 
sample of the language. Table 1.1 gives the degree of synthesis for a small 
selection of languages. 


⁴ There is another definition of polysynthetic in use among linguists, according to which a 
language is polysynthetic if single words in the language typically correspond to multi- 
word sentences in other languages. In this book we will not use the term in this sense, but 
under such a definition, Swahili would be classified as a polysynthetic language. 
Language            Ratio of morphemes per word
West Greenlandic    3.72
Sanskrit            2.59
Swahili             2.55
Old English         2.12
Lezgian             1.93
German              1.92
Modern English      1.68
Vietnamese          1.06

Table 1.1 The degree of synthesis of some languages
Source: based on Greenberg (1959), except for Lezgian 
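
The computation behind the degree of synthesis can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `degree_of_synthesis` is hypothetical, the input must already be segmented with hyphens between morphemes (the convention used in the examples of this chapter), and sentence punctuation is assumed to have been removed. A single sentence is not a random text sample, so the result will not reproduce the figures in Table 1.1.

```python
def degree_of_synthesis(text):
    """Ratio of morphemes per word, with hyphens marking morpheme boundaries."""
    words = text.split()
    if not words:
        raise ValueError("text contains no words")
    # Each word contributes one morpheme more than it has hyphens.
    morphemes = sum(len(word.split("-")) for word in words)
    return morphemes / len(words)


# The Swahili sentence in (1.4), segmented as in the example.
swahili = "Ndovu wa-wili wa-ki-song-ana zi-umia-zo ni nyika"
print(degree_of_synthesis(swahili))  # 12 morphemes divided by 6 words
```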


Although English has much more morphology than isolating languages like 
Yoruba and Vietnamese, it still has a lot less than many other languages. For 
this reason, it will be necessary to refer extensively to languages other than 
English in this book. 


1.3 The goals of morphological research 


Morphological research aims to describe and explain the morphological 
patterns of human languages. It is useful to distinguish four more specific 
sub-goals of this endeavour: elegant description, cognitively realistic 
description, system-external explanation and a restrictive architecture for 
description. 

(i) Elegant description. All linguists agree that morphological patterns 
(just like other linguistic patterns) should be described in an elegant and 
intuitively satisfactory way. Thus, morphological descriptions should 
contain a rule saying that English nouns form their plural by adding -s, 
rather than simply listing the plural forms for each noun in the dictionary 
(abbot, abbots; ability, abilities; abyss, abysses; accent, accents; ...). In a computer 
program that simulates human language, it may in fact be more practical to 
adopt the listing solution, but linguists would find this inelegant. The main 
criterion for elegance is generality. Scientific descriptions should, of course, 
reflect generalizations in the data and should not merely list all known 
individual facts. But generalizations can be formulated in various ways, 
and linguists often disagree in their judgements of what is the most elegant 
description. It is therefore useful to have a further objective criterion that 
makes reference to the speakers’ knowledge of their language. 

(ii) Cognitively realistic description. Most linguists would say that 
their descriptions should not only be elegant and general, but they should 
also be cognitively realistic. In other words, they should express the same 
generalizations about grammatical systems that the speakers' cognitive 
apparatus has unconsciously arrived at. We know that the speakers' 
knowledge of English not only consists of lists of singulars and plurals, 
but comprises a general rule of the type 'add -s to a singular form to get 
a plural noun'. Otherwise speakers would be unable to form the plural 
of nouns they have never encountered before. But they do have this 
ability: if you tell an English speaker that a certain musical instrument is 
called a duduk, they know that the plural is (or can be) duduks. The dumb 
computer program that contains only lists of singulars and plurals would 
fail miserably here. Of course, cognitively realistic description is a much 
more ambitious goal than merely elegant description, and we would really 
have to be able to look inside people's heads for a full understanding of 
the cognitive machinery. Linguists sometimes reject proposed descriptions 
because they seem cognitively implausible, and sometimes they collaborate 
with psychologists and neurologists and take their research results into 
account. 
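
The contrast between the listing solution and the rule solution can be made concrete with a short Python sketch. The names `LISTED_PLURALS`, `plural_from_list` and `plural_from_rule` are hypothetical, and the rule deliberately ignores spelling adjustments (as in abilities or abysses); it stands only for the general pattern 'add -s'.

```python
# A listing-only description: every plural must be stored individually.
LISTED_PLURALS = {
    "abbot": "abbots",
    "ability": "abilities",
    "abyss": "abysses",
    "accent": "accents",
}


def plural_from_list(noun):
    """Look the plural up; return None for nouns that were never listed."""
    return LISTED_PLURALS.get(noun)


def plural_from_rule(noun):
    """Apply the general rule 'add -s to a singular form'."""
    return noun + "s"


print(plural_from_list("duduk"))  # the list has no entry for a new noun
print(plural_from_rule("duduk"))  # the rule extends to it
```

Only the rule-based function extends to a noun it has never seen, which mirrors what speakers can do with duduk.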

(iii) System-external explanation. Once a satisfactory description of 
morphological patterns has been obtained, many linguists ask an even 
more ambitious question: why are the patterns the way they are? In other 
words, they ask for explanations. But we have to be careful: most facts 
about linguistic patterns are historical accidents and as such cannot be 
explained. The fact that the English plural is formed by adding -s is a good 
example of such a historical accident. There is nothing necessary about 
plural -s: Hungarian plurals are formed by adding -k, Swedish plurals 
add -r, Hebrew plurals add -im or -ot, and so on. A frequent way to pursue 
explanation in linguistics is to analyze universals of human language, since 
these are more likely to represent facts that are in need of explanation at 
a deep level. And as a first step, we must find out which morphological 
patterns are universal. Clearly, the s-plural is not universal, and, as we 
saw in the preceding section, not even the morphological expression of 
the plural is universal — Yoruba is an example of a language that lacks 
morphological plurals. So even the fact that English nouns have plurals is 
no more than a historical accident. But there is something about plurals that 
is not accidental: nouns denoting people are quite generally more likely 
to have plurals than nouns denoting things. For instance, in Tzutujil, only 
human nouns have regular morphological plural forms (Dayley 1985: 139). 
We can formulate the universal statement in (1.7). 


(1.7) A universal statement: If a language has morphological plural 
forms of nouns at all, it will have plurals of nouns denoting people. 
(Corbett 2000: ch. 3) 


Because of its ‘if ... then’ form, this statement is true also of languages like 
English (where most nouns have plurals) and Yoruba (where nouns do not 
have a morphological plural). Since it is (apparently) true of all languages, 
it is in all likelihood not a historical accident, but reflects something deeper, 
a general property of human language that can perhaps be explained 
with reference to system-external considerations. For instance, one might 
propose that (1.7) is the case because, when the referents of nouns are 
people, it makes a greater difference how many they are than when the 
referents are things. Thus, plurals of people-denoting nouns are more 
useful, and languages across the world are thus more likely to have them. 
This explanation (whatever its merits) is an example of a system-external 
explanation in the sense that it refers to facts outside the language system: 
the usefulness of number distinctions in speech. 
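
The logic of the ‘if ... then’ statement in (1.7) can be spelled out in a small Python sketch. The class and field names are hypothetical, the three real languages are coded only as far as the text above describes them, and the fourth entry is an invented counterexample included purely to show what a violation would look like.

```python
from dataclasses import dataclass


@dataclass
class NumberMarking:
    language: str
    some_nouns_have_plural: bool   # any morphological noun plurals at all
    human_nouns_have_plural: bool  # plurals of nouns denoting people


def satisfies_universal(lang):
    """(1.7): if a language has noun plurals, human nouns have them."""
    return (not lang.some_nouns_have_plural) or lang.human_nouns_have_plural


LANGUAGES = [
    NumberMarking("English", True, True),    # most nouns have plurals
    NumberMarking("Tzutujil", True, True),   # only human nouns have them
    NumberMarking("Yoruba", False, False),   # no morphological plurals
    NumberMarking("(hypothetical)", True, False),  # invented violation
]

for lang in LANGUAGES:
    print(lang.language, satisfies_universal(lang))
```

Yoruba satisfies the statement because the condition of the implication is not met, which is why (1.7) can hold of languages without any morphological plural.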

(iv) A restrictive architecture for description. Many linguists see an 
important goal of grammatical research in formulating some general design 
principles of grammatical systems that all languages seem to adhere to. 
In other words, linguists try to construct an architecture for description 
(also called grammatical theory) that all language-particular descriptions 
must conform to. For instance, it has been observed that rules by 
which constituents are fronted to the beginning of a sentence can affect 
syntactic constituents (such as whole words or phrases) but not 
morphological constituents (i.e. morphemes that are parts of larger words). 
Thus, (1.8b) is a possible sentence (it can be derived from a structure like 
(1.8a)), but (1.9b) is impossible (it cannot be derived from (1.9a)). (The 
subscript line __ stands for the position that the question word what would 
occupy if it had not been moved to the front.) 


(1.8) a. We can buy cheese. 
      b. What can we buy __ ? 


(1.9) a. We can buy a cheeseboard. 
b. *What can we buy a __ -board? 


This restriction on fronting (which seems to hold for all languages that have 
such a fronting rule) follows automatically if fronting rules (such as what- 
fronting) and morpheme-combination rules (such as compounding, which 
yields cheeseboard from cheese and board) are separated from each other in the 
descriptive architecture. A possible architecture for grammar is shown in 
Figure 1.1, where the boxes around the grammatical components ‘syntax’, 
‘morphology’ and ‘phonology’ symbolize the separateness of each of the 
components. 


morphology                      syntax                      phonology
• morpheme-combination rules    • fronting rules            • pronunciation rules
                                • word-combination rules

Figure 1.1 A possible descriptive architecture for grammar 
This architecture is restrictive because it automatically disallows certain 
logically possible interactions of rules (see Section 9.4 for more discussion). 
Many linguists assume that the architecture of grammar is innate — it is the 
same for all languages because it is genetically fixed for the human species. 
The innate part of speakers' grammatical knowledge is also called Universal 
Grammar. For these linguists, one goal of morphological research is to 
discover those principles of the innate Universal Grammar that are relevant 
for word structure. 

The goals (iii) and (iv) are similar in that both ask deeper, theoretical 
questions, and both exclusively concern universal aspects of morphology. 
And both are more ambitious than (i) and (ii) in that they involve 
explanation in some sense. Thus, one might ask questions such as ‘Why 
cannot constituents of words be fronted to the beginning of the sentence?’ 
and answer them from a Universal Grammar-oriented perspective with 
reference to a hypothesis about the innate architecture of grammar (‘Because 
fronting rules are part of the syntactic component, and morpheme- 
combinations are part of morphology, and syntax and morphology are 
separate’). However, explanations of this kind are strictly system-internal, 
whereas explanations of the kind we saw earlier are even more general 
in that they link universal properties of grammars to general facts about 
human beings that are external to the grammatical system. 

It is a curious observation on the sociology of science that currently most 
linguists seem to be concerned either with system-external explanation 
or with formulating an architecture for grammatical description, but not 
with both goals simultaneously. There are thus two primary orientations 
in contemporary theoretical morphological research: the functionalist 
orientation, which aims at system-external explanation, and the generative 
(or formalist) orientation, which seeks to discover the principles of the 
innate grammatical architecture. However, it does not seem wise to 
divide the labour of morphological research in this way, because neither 
system-external factors nor innate principles can explain the whole range 
of morphological patterns. Accordingly, both goals will be simultaneously 
pursued in the more theoretically oriented parts of this book. 


1.4 A brief user's guide to this book 


Sources of data 


In this book we give examples from many different languages, and 
attributions for this data follow standard practice. For examples from less 
widely known languages, the reference is given after the example. However, 
when the examples are from well-known and widely studied languages 
such as Modern English, Russian, Standard Arabic or Old English, we 
do not give a reference because the data can easily be obtained from any 
standard reference book. 


Sources of ideas 


In this book, we focus on morphological data and problems of analysis, not 
on the history of thinking about these issues in linguistics. Thus, we rarely 
mention names of particular authors in the text, and references to sources of 
ideas are given only in a few very specific cases (as in Table 1.1 and example 
(1.7)). In general, the reader is referred to the section ‘Further reading’, 
where important works on theoretical morphology are mentioned. 


Comprehension exercises 


Each chapter contains exercises designed to help the reader solidify 
understanding of the material. Answers to these exercises can be found at 
the end of the book. 


Exploratory exercises 


Many chapters also contain a longer exercise that extends the chapter 
material. These are exploratory in nature, so no answers are provided. 


Glossary 


The glossary contains the technical terms relating to morphology that 
are used in this book. In addition to giving a brief definition, the glossary 
also refers the reader to the most important places where the term 
is discussed in the text. These terms are printed in bold where they are first 
discussed in the text. 


Language index 


Many languages mentioned in this book will be unfamiliar to the reader. 
The language index serves to give information on each language, in 
particular its genealogical affiliation, the place where it is spoken, and its 
ISO 639-3 code. ISO 639-3 is an international standard that assigns a unique 
code to every language. The reader is encouraged to use these codes to find 
more information about the languages discussed in the book; the on-line 
language encyclopaedia Ethnologue (www.ethnologue.com) is particularly 
helpful in this regard. 


Spelling and transcription 


Morphology of spoken languages deals with spoken words, so ideally all the 
examples should be in phonetic transcription in this book. But since many 
languages have a conventional spelling that renders the pronunciation 
more or less faithfully, it was more practical and less confusing to adopt that 
spelling for the examples here. (Although English spelling is not particularly 
close to the pronunciation, English examples will usually be given in the 
spelling, because it is assumed that the readers know their pronunciation.) 
Examples cited in the spelling (or conventional transliteration) are always 
printed in italics, whereas examples cited in phonetic transcription are 
printed in ordinary typeface and are usually included in square brackets. 
Readers not familiar with phonetic transcription should consult any 
phonetics or phonology textbook. 


Abbreviations 


A list of abbreviations (especially abbreviations of grammatical terms) is 
found on pp. xv-xvi. 


Summary of Chapter 1 


Morphology is most simply defined as the study of the combination of 
morphemes to yield words, but a somewhat more abstract definition 
(as the study of systematic covariation in the form and meaning of 
words) will turn out to be more satisfactory. Different languages vary 
strikingly in the extent to which they make use of morphology. The 
goals of morphological research are (on the descriptive level) elegant 
and cognitively realistic description of morphological structures, 
plus (on the theoretical level) system-external explanation and the 
discovery of a restrictive architecture for description. 


Further reading 


For an elementary introduction to morphology, see Coates (1999) or 
Katamba and Stonham (2006). 

Other morphology textbooks that are somewhat similar in scope to the 
present book are Bauer (2003), Bubenik (1999), and Plag (2003) (as well as 
Scalise (1994), in Italian, and Plungian (2000), in Russian). Spencer (1991) is a 
very thorough introduction that concentrates on the generative orientation 
in morphology. Matthews (1991) puts particular emphasis on the definition 
of morphological concepts. Carstairs-McCarthy (1991) gives an excellent 
overview of the theoretical debates in the 1970s and 1980s. Booij (2007) 
devotes a chapter to the mental processing and storage of words. Aronoff 
and Fudeman (2005) is a source for techniques of morphological analysis. 

The most comprehensive work on morphology that has ever been 
written by a single author is Mel'čuk (1993-2000) (five volumes, in French). 
Although its style is somewhat unusual, it is very readable. 

Reference works that are devoted exclusively to morphology are Spencer 
and Zwicky (1998) and Booij, Lehmann and Mugdan (2000-2004). A 
bibliography is Beard and Szymanek (1988). Bauer (2004) is a glossary of 
morphological terms. 

The complementarity of the functionalist and the generative approaches 
to morphology is explained and emphasized in the introductory chapter of 
Hall (1992). 

An introduction to a sign language that also discusses morphology is 
Sutton-Spence and Woll (1999). 

A note on the history of the term morphology: in the biological sense ('the 
study of the form of animals and plants’), the term was coined by Johann 
Wolfgang von Goethe (1749-1832), and, in the linguistic sense, it was first 
used by August Schleicher (1859). 


Comprehension exercises 


1. Which of the following English words are morphologically complex? 
For each complex word, list at least two other words that provide 
evidence for your decision (i.e. words that are both semantically and 
formally related to it). 


nights, owl, playing, affordable, indecent, religion, indolent, bubble, during, 
searched, hopeful, redo 


2. Identify the morphological constituents and describe their meanings in 
the following Mandarin Chinese nouns. 


chàngcí ‘libretto’               dǐngdēng ‘top light’
chàngjī ‘gramophone’             diànchē ‘streetcar, tram’
chuánwěi ‘stern’                 diàndēng ‘electric lamp’
cíwěi ‘suffix’                   diànjī ‘electrical machine’
diànlì ‘electric power’          qìchē ‘car’
diànshì ‘television’             qìchuán ‘steamship’
dòngwùxué ‘zoology’              shāndǐng ‘summit’
dòngwùyóu ‘animal oil’           shìchàng ‘sightseeing’
dòngwùyuán ‘zoo’                 shìlì ‘eyesight’
fángdǐng ‘roof’                  shùcí ‘number word’
fángkè ‘tenant’                  shuǐchē ‘watercart’
fēichuán ‘airship’               shuǐlì ‘waterpower’
fēijī ‘aeroplane’                shùxué ‘mathematics’
fēiyú ‘flying fish’              wěidēng ‘tail light’
huāchē ‘festooned vehicle’       wěishuǐ ‘tail water’
huāyuán ‘flower garden’          yóudēng ‘oil lamp’
jīchē ‘locomotive’               yóuzhǐ ‘oil paper’
jiǎolì ‘strength of one’s legs’  yúyóu ‘fish oil’
kèfáng ‘guest house’             zhǐhuā ‘paper flower’

3. Identify the morphological constituents and their meanings in the 
following Tzutujil verbs (Dayley 1985: 87). (A note on Tzutujil spelling: 
x is pronounced [ʃ], and 7 is pronounced [ʔ].) 


xinwari   ‘I slept’             xoqeeli   ‘we left’
neeli     ‘he or she leaves’    ninwari   ‘I sleep’
ne7eeli   ‘they leave’          xixwari   ‘you(PL) slept’
nixwari   ‘you(PL) sleep’       xe7eeli   ‘they left’
xateeli   ‘you(SG) left’        xwari     ‘he or she slept’
natwari   ‘you(SG) sleep’


How would you say ‘I left’, ‘he or she sleeps’, ‘we sleep’? 


4. In the following list of Hebrew words, find at least three sets of word 
pairs whose two members covary formally and semantically, so that a 
morphological relationship can be assumed. For each set of word pairs, 
describe the formal and semantic differences. 


kimut ‘wrinkling’         maħšev ‘computer’
diber ‘he spoke’          masger ‘lock’
ħašav ‘he thought’        dibra ‘she spoke’
sagra ‘she shut’          milmel ‘he muttered’
ħašva ‘she thought’       kimta ‘she wrinkled’
kalat ‘he received’       milmla ‘she muttered’
maklet ‘radio receiver’   sagar ‘he shut’
kalta ‘she received’      dibur ‘speech’
kimet ‘he wrinkled’


Basic concepts 


We have seen that morphological structure exists if a group of words 
shows partial form-meaning resemblances. In most cases, the relation 
between form and meaning is quite straightforward: parts of words bear 
different meanings. Consider the examples in (2.1). 


(2.1) read       read-s          read-er       read-able
      wash       wash-es         wash-er       wash-able
      write      write-s         writ-er       writ-able
      kind       kind-ness       un-kind
      happy      happi-ness      un-happy
      friendly   friendli-ness   un-friendly


These words are easily segmented, i.e. broken up into individually 
meaningful parts: read + s, read + er, kind + ness, un + happy, and so on. These 
parts are called morphemes.¹ Words may of course consist of more than two 
morphemes, e.g. un-happi-ness, read-abil-ity, un-friend-ly, un-friend-li-ness. 
Morphemes can be defined as the smallest meaningful constituents of 
a linguistic expression. When we have a sentence such as Camilla met an 
unfriendly chameleon, we can divide it into meaningful parts in various 
ways, e.g. Camilla/met an unfriendly chameleon, or Camilla/met/an/unfriendly/ 
chameleon, or Camilla/met/an/un/friend/ly/chameleon. But further division is 
not possible. When we try to divide chameleon further (e.g. cha/meleon), we 
do not obtain parts that can be said to be meaningful, either because they are 
not found in any other words (as seems to be the case with meleon), or because 
the other words in which they occur do not share any aspect of meaning 
with chameleon (cf. charisma, Canadian, caboodle, capacity, in which it would be 
theoretically possible to identify a word part cha/ca-). Thus, chameleon cannot 


1 Some approaches question the usefulness of the notion ‘morpheme’. We will discuss these 
extensively in Chapters 3 and 4, but for the moment it is helpful to begin in this more 
conventional way. 
be segmented into several morphemes; it is monomorphemic. Morphemes 
are the ultimate elements of morphological analysis; they are, so to speak, 
morphological atoms. 

In this chapter we introduce some other fundamental concepts and their 
related terms, starting with lexemes and word-forms. 


2.1 Lexemes and word-forms 


The most basic concept of morphology is of course the concept ‘word’. For the 
sake of convenience, let us assume for the moment that a word is whatever 
corresponds to a contiguous sequence of letters.² Thus, in one sense the 
first sentence of this paragraph consists of twelve words, each separated 
by a blank space from the neighbouring word(s). And in another sense 
the sentence has nine words - there are nine different sequences of letters 
separated by spaces. But when a dictionary is made, not every sequence 
of letters is given its own entry. For instance, the words live, lives, lived and 
living are pronounced differently and are different words in that sense. But 
a dictionary would contain only a single entry LIVE. The dictionary user 
is expected to know that live, lives, lived and living are different concrete 
instantiations of the ‘same’ word LIVE. Thus, there are three rather different 
notions of ‘word’. When a word is used in some text or in speech, that 
occurrence of the word is sometimes referred to as a word token. In this 
sense the first sentence in the paragraph consists of twelve words. The other 
two senses of the term ‘word’ are not defined in reference to particular texts; 
they correspond to the ‘dictionary word’ and the ‘concrete word’. Since this 
distinction is central to morphology, we need special technical terms for the 
two notions, lexeme and word-form, respectively. 

A lexeme is a word in an abstract sense. LIVE is a verb lexeme. It represents 
the core meaning shared by forms such as live, lives, lived and living. In most 
languages, dictionaries are organized according to lexemes, so it is usually 
reasonable to think of a lexeme as a 'dictionary word'. Although we must 
assign names to lexemes to be able to talk about them, lexemes are abstract 
entities that have no phonological form of their own. LIVE is therefore just a 
convenient label to talk about a particular lexeme; the sequence of sounds 
[lɪv] is not the lexeme itself. Sometimes we will use the convention of 
writing lexemes in small capital letters. 

By contrast, a word-form is a word in a concrete sense. It is a sequence 
of sounds that expresses the combination of a lexeme (e.g. LIVE) and a set 


2 Of course, we should really define words in terms of sounds, since language is primarily 
a spoken (not written) medium, and there are other problems with this definition as well. 
But it is sufficient for the present purposes. A more sophisticated approach is deferred to 
Chapter 9. 
of grammatical meanings (or grammatical functions) appropriate to that 
lexeme (e.g. third person singular present tense). Lives is a word-form. 
Thus, word-forms are concrete in that they can be pronounced. 

Lexemes can be thought of as sets of word-forms, and every word-form 
belongs to one lexeme. The word-forms live, lives, lived, and living all belong 
to the lexeme LIVE. Word-forms belonging to the same lexeme express 
different grammatical functions, but the same core concept. When a word- 
form is used in a particular text or in speech, this instance of use is a word 
token. The first sentence of this paragraph thus has sixteen word tokens, 
fifteen word-forms (of is repeated), and thirteen lexemes (e.g. lexemes and 
lexeme both belong to LEXEME). 
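
The three counts can be reproduced with a short Python sketch. The `LEMMAS` table is a hypothetical, hand-made stand-in for a real dictionary lookup: it lists only the two pairs of forms that share a lexeme in this sentence, and every other word-form is simply assumed to represent a lexeme of its own.

```python
import re

sentence = ("Lexemes can be thought of as sets of word-forms, "
            "and every word-form belongs to one lexeme.")

# Word tokens: every occurrence of a word in the text.
# Hyphenated words such as "word-forms" are kept together.
tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", sentence.lower())

# Word-forms: distinct concrete shapes, so a repeated token counts once.
word_forms = set(tokens)

# Lexemes: word-forms grouped under one abstract 'dictionary word'.
# LEMMAS is illustrative only; forms not listed are labelled by their
# own upper-cased spelling, which does not change the count here.
LEMMAS = {"lexemes": "LEXEME", "word-forms": "WORD-FORM"}
lexemes = {LEMMAS.get(form, form.upper()) for form in word_forms}

print(len(tokens), len(word_forms), len(lexemes))  # 16 15 13
```

Only the last step needs linguistic knowledge: tokens and word-forms can be counted mechanically, whereas grouping word-forms into lexemes requires knowing which forms belong together.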

In the most interesting case, lexemes consist of a fair number of word- 
forms. The set of word-forms that belongs to a lexeme is often called a 
paradigm. The paradigm of the Modern Greek noun lexeme FILOS ‘friend’ 
is given in (2.2). (Earlier we saw a partial paradigm of two Sumerian verb 
lexemes (Section 1.1).) 


(2.2) The paradigm of FILOS

                   singular   plural
      nominative   filos      fili
      accusative   filo       filus
      genitive     filu       filon

This paradigm contains six different word-forms and expresses notions of 
number (singular, plural) and case (nominative, accusative, genitive).³ By 
contrast, English nouns have no more than four word-forms (e.g. ISLAND: 
island, islands and perhaps island's, islands’), but the notional distinction 
between lexemes and word-forms is no less important when the paradigm 
is small. In fact, for the sake of consistency we have to make the distinction 
even when a lexeme has just a single word-form, as in the case of many 
English adjectives (e.g. the adjective SOLID, which has only the word-form 
solid). 

It is not always immediately clear how many word-forms belong to a 
lexeme. This is shown by the paradigm of the Latin noun lexeme INSULA 
‘island’ in (2.3). Are there ten word-forms in this lexeme's paradigm, or 
seven? 


(2.3) The paradigm of INSULA

                   singular   plural
      nominative   insula     insulae
      accusative   insulam    insulās
      genitive     insulae    insulārum
      dative       insulae    insulīs
      ablative     insulā     insulīs


3 The meanings of the cases are discussed in Chapter 5. They are also given in the Glossary. 
Above we defined a word-form in terms of a lexeme and a set of grammatical 
functions. The importance of the latter part of the definition is seen in 
paradigms like INSULA. Although there are only seven different sequences of 
sounds in (2.3), we can still say that the paradigm of INSULA has ten word- 
forms, because ten different sets of grammatical functions are expressed 
(e.g. genitive singular and nominative plural are distinct, despite having 
the same form). 
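
The difference between the two counts can be made concrete in a small Python sketch. It stores the paradigm as a mapping from sets of grammatical functions to forms. Vowel length is written with macrons here, which is an assumption of the sketch: it is what keeps the ablative singular apart from the nominative singular and so yields seven distinct shapes.

```python
# Paradigm of INSULA: (case, number) -> form.
# Macrons mark long vowels (an assumption of this sketch); without them
# nominative and ablative singular would be spelled alike.
PARADIGM = {
    ("nominative", "singular"): "insula",
    ("accusative", "singular"): "insulam",
    ("genitive", "singular"): "insulae",
    ("dative", "singular"): "insulae",
    ("ablative", "singular"): "insulā",
    ("nominative", "plural"): "insulae",
    ("accusative", "plural"): "insulās",
    ("genitive", "plural"): "insulārum",
    ("dative", "plural"): "insulīs",
    ("ablative", "plural"): "insulīs",
}

# One word-form per set of grammatical functions.
word_form_count = len(PARADIGM)

# Distinct sequences of sounds, regardless of function.
shape_count = len(set(PARADIGM.values()))

# Group the cells that happen to share a shape.
cells_by_shape = {}
for cell, form in PARADIGM.items():
    cells_by_shape.setdefault(form, []).append(cell)
shared = {form: cells for form, cells in cells_by_shape.items() if len(cells) > 1}

print(word_form_count, shape_count)  # 10 7
```

In this representation `shared` contains exactly the shapes that fill more than one cell, such as insulae, which serves as genitive singular, dative singular and nominative plural.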

Not all morphological relationships are of the type illustrated in (2.2) 
and (2.3). Different lexemes may also be related to each other, and a set of 
related lexemes is sometimes called a word family (though it should more 
properly be called a lexeme family): 


(2.4) Two English word families 
a. READ, READABLE, UNREADABLE, READER, READABILITY, REREAD 
b. LOGIC, LOGICIAN, LOGICAL, ILLOGICAL, ILLOGICALITY 


Although everyone recognizes that these words are related, they are given 
their own dictionary entries. Thus, the difference between word-forms and 
lexemes, and between paradigms and word families, is well established 
in the practice of dictionary-makers, and thereby known to all educated 
language users. 

At this point we have to ask: why is it that dictionaries treat different 
morphological relationships in different ways? And why should linguists 
recognize the distinction between paradigms and word families? After 
all, linguists cannot base their theoretical decisions on the practice of 
dictionary-makers — it ought to be the other way round: lexicographers 
ought to be informed by linguists' analyses. The nature of the difference 
between lexemes and word-forms will be the topic of Chapter 5, but the 
most important points will be anticipated here. 

(i) Complex lexemes (such as READER or LOGICIAN) generally denote new 
concepts that are different from the concepts of the corresponding simple 
lexemes, whereas word-forms often exist primarily to satisfy a formal 
requirement of the syntactic machinery of the language. Thus, word-forms 
like reads or reading do not stand for concepts different from read, but they 
are needed in certain syntactic contexts (e.g. the girl reads a magazine; reading 
magazines is fun). 

(ii) Complex lexemes must be listed separately in dictionaries because 
they are less predictable than word-forms. For instance, one cannot 
predict that the lexeme illogicality exists, because by no means all 
adjectives have a corresponding -ity lexeme (cf. nonexistent words like 
*naturality, *logicality). It is impossible to predict that a specialist in logic 
should be called a logician (rather than, say, a *logicist), and the meaning 
of complex lexemes is often unpredictable, too: a reader can denote not 
just any person who reads, but also a specific academic position (in the 
British system) or even a kind of book. By contrast, the properties of 
word-forms are mostly predictable and hence do not need to be listed 
separately for each lexeme. 

Thus, there are two rather different kinds of morphological relationship 
among words, for which two technical terms are commonly used: 


(2.5) Kinds of morphological relationship 


inflection (= inflectional morphology): the relationship between 
word-forms of a lexeme 


derivation (= derivational morphology): the relationship between 
lexemes of a word family 


Morphologists also use the corresponding verbs inflect and derive. For 
instance, one would say that the Latin lexeme INSULA is inflected (or 
inflects) for case and number, and that the lexeme READER is derived from 
the lexeme READ. A derived lexeme is also called a derivative. 

(Note that we are making a terminological simplification here: a lexeme 
is an abstract entity without phonological form so, strictly speaking, 
one lexeme cannot be derived from another. When morphologists talk 
about derived lexemes, they mean that form a (e.g. reader), corresponding 
to lexeme A (READER), is derived from form b (read), corresponding to 
lexeme B (READ). However, since this phrasing becomes quite clumsy, 
morphologists commonly simplify the terminology. We will do the same 
in this book.) 

It is not always easy to tell how word-forms are grouped into lexemes. 
For instance, does the word-form nicely belong to the lexeme NICE, or does 
it represent a lexeme of its own (NICELY), which is in the same word family 
as NICE? Issues of this sort will be discussed in some detail in Chapter 5. 
Whenever it is unclear or irrelevant whether two words are inflectionally 
or derivationally related, the term word will be used in this book instead 
of lexeme or word-form. And for the same reason even the most technical 
writings on morphology often continue to use the term word. 

Some morphologically complex words belong to two (or more) word 
families simultaneously. For instance, the lexeme FIREWOOD belongs both 
in the family of FIRE and in the family of WOOD. Such relationships are 
called compounding, and lexemes like FIREWOOD are called compound 
lexemes, or just compounds, for short. Compounding is often grouped 
together with derivation under the category of word formation (i.e. lexeme 
formation). The various conceptual distinctions that we have seen so far are 
summarized in Figure 2.1. 
                       morphological relationships

        inflection                         word formation
  (‘word-form formation’)               (‘lexeme formation’)

                                     derivation       compounding

  paradigms:                         word families:
  e.g. live, lives, living, ...      e.g. LOGIC, LOGICIAN, ...   FIREWOOD
       island, islands, ...


Figure 2.1 Subdivisions of morphology 


2.2 Affixes, bases and roots 


In both inflection and derivation, morphemes have various kinds of 
meanings. Some meanings are very concrete and can be described easily 
(e.g. the meanings of the morphemes wash, logic, chameleon, un-), but other 
meanings are abstract and more difficult to describe. For instance, the 
morpheme -al in logic-al can perhaps be said to mean ‘relating to’ (cf. logic- 
al, mathematic-al, physic-al, natur-al), -able in read-able can be said to mean 
‘capable of undergoing a process’, and the meaning of -ity is ‘quality’ (e.g. 
readability is ‘the quality of being readable’). Some meanings are so abstract 
that they can hardly be called meanings. For example, the Latin morpheme 
-m in insula-m (see (2.3)) serves to mark the direct object in a sentence, but 
it is difficult to say what its meaning is. And English -s in read-s is required 
when the subject is a third person singular noun phrase, but again it is 
unclear whether it can be said to have meaning. In such cases, linguists are 
more comfortable saying that these morphemes have certain grammatical 
functions. But, since the ultimate purpose of grammatical constructions is to 
express meaning, we will continue to say that morphemes bear meaning, 
even when that meaning is very abstract and can be identified only in the 
larger grammatical context. 

Word-forms in an inflectional paradigm generally share (at least) one 
longer morpheme with a concrete meaning and are distinguished from 
each other in that they additionally contain different shorter morphemes, 
called affixes. An affix attaches to a word or a main part of a word. It usually 
has an abstract meaning, and an affix cannot occur by itself. For instance, 
Russian nouns have different affixes in the paradigm in (2.6), which have 
case meaning (-a for nominative, -u for accusative, etc.), and Classical 
Nahuatl nouns have different affixes in the paradigm in (2.7) that indicate a 
possessor (no- for ‘my’, mo- for ‘your’, etc.). 
(2.6) Russian case inflection (singular forms) 

      nominative     ruk-a    ‘hand’
      accusative     ruk-u
      genitive       ruk-i
      dative         ruk-e
      locative       ruk-e
      instrumental   ruk-oj

(2.7) Nahuatl possessor inflection 

      1SG   no-cal    ‘my house’
      2SG   mo-cal    ‘your (SG) house’
      3SG   i-cal     ‘his/her house’
      1PL   to-cal    ‘our house’
      2PL   amo-cal   ‘your (PL) house’
      3PL   in-cal    ‘their house’
                                  (Sullivan 1988: 26) 


Morphologists often use special terms for different kinds of affixes, 
depending on their position within the word. Affixes that follow the main 
part of the word are called suffixes (e.g. the Russian case suffixes in (2.6)), 
and affixes that precede it are called prefixes (e.g. the Classical Nahuatl 
possessor prefixes in (2.7)). The part of the word that an affix is attached to 
is called the base, e.g. ruk- in Russian, or -cal in Classical Nahuatl. Affixes 
and bases can, of course, be identified both in inflected word-forms and 
in derived lexemes. For instance, in read-er, read-able and re-read, read is the 
base, -er and -able are suffixes, and re- is a prefix. A base is also sometimes 
called a stem, especially if an inflectional (as opposed to derivational) affix 
attaches to it. 

There are still other kinds of affixes, besides prefixes and suffixes, which 
are briefly described and illustrated in Table 2.1. 


Types of affixes                   Examples

suffix: follows the base           Russian -a in ruk-a ‘hand’
                                   English -ful in event-ful

prefix: precedes the base          Classical Nahuatl no- in no-cal ‘my house’
                                   English un- in unhappy

infix: occurs inside the base      Arabic -t- in (i)š-t-agala ‘be occupied’
                                   (base: šagala)
                                   Tagalog -um- in s-um-ulat ‘write’
                                   (base: sulat)

circumfix: occurs on both sides    German ge-...-en, e.g. ge-fahr-en ‘driven’
of the base                        (base: fahr)


Table 2.1 Types of affixes 
Bases or stems can be complex themselves. For instance, in activity, -ity 
is a suffix that combines with the base active, which itself consists of the 
suffix -ive and the base act. A base that cannot be analyzed any further into 
constituent morphemes is called a root. In readability, read is the root (and 
the base for readable), and readable is the base for readability, but it is not a 
root. Thus, the base is a relative notion that is defined with respect to the 
notion ‘affix’. (We will refine this definition of ‘base’ in the next chapter to 
account for words which are difficult to describe in terms of morphemes, 
but will keep the idea that bases are relative notions.) Affixes are similar to 
roots in that they cannot be further analyzed into component morphemes; 
they are primitive elements. 
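
The relative nature of the notion ‘base’ can be illustrated with a small data structure. The following Python sketch is only one possible way to model it; the class `Word` and the helpers `root_of` and `derivation_chain` are names invented for this illustration. Each complex word records the base it was built on and the affix that was attached, and a root is simply a word with no base.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Word:
    form: str                       # spelling of this word or base
    base: Optional["Word"] = None   # what the affix attaches to; None for a root
    affix: Optional[str] = None     # the affix added to the base, e.g. "-ity"


def root_of(word: Word) -> Word:
    """Follow the chain of bases until an unanalyzable base is reached."""
    while word.base is not None:
        word = word.base
    return word


def derivation_chain(word: Word) -> list:
    """Return the forms from the root outwards, one affix at a time."""
    chain = []
    current = word
    while current is not None:
        chain.append(current.form)
        current = current.base
    return chain[::-1]


read = Word("read")
readable = Word("readable", base=read, affix="-able")
readability = Word("readability", base=readable, affix="-ity")

print(readability.base.form)          # readable
print(root_of(readability).form)      # read
print(derivation_chain(readability))  # ['read', 'readable', 'readability']
```

Here readable is the base of readability without being a root, whereas read is both a base (of readable) and the root of all three words.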

A base may or may not be able to function as a word-form. For instance, 
in English, cat is both the base of the inflected form cats and itself a word- 
form (active is a word-form and the base for the derived form activity, etc.). 
However, the Italian word-form gatti (‘cats’) can be broken up into the suffix 
-i (‘plural’) and the base gatt- (‘cat’), but gatt- is not a word-form. Italian 
nouns must inflect for number, and even in the singular, an affix is required 
to express this information (e.g. gatt-o ‘cat’, gatt-i ‘cats’). In this respect 
Italian differs from English. Bases that cannot also function as word-forms 
are called bound stems. 

Roots and affixes can generally be distinguished quite easily, but sometimes 
there are problems. For example, the Salishan language Bella Coola has a 
number of suffix-like elements that do not seem to have an abstract meaning 
(see 2.8). In (2.9), we see two examples of how these elements are used. 


(2.8) -us ‘face’      -lik ‘body’
      -an ‘ear’       -altwa ‘sky, weather’
      -uc ‘mouth’     -lt ‘child’
      -aɬ ‘foot’      -lst ‘rock’
      -ak ‘hand’      -lxs ‘nose’


(2.9) a.  qué-aɬ-ic
          wash-foot-I.him
          ‘I am going to wash his foot’ (lit.: ‘foot-wash him’)

      b.  kma-lxs-c
          hurt-nose-I
          ‘my nose hurts’ (lit.: ‘I nose-hurt’)
          (Mithun 1998: 300-5) 


In these cases, it is not immediately clear whether we are dealing with 
suffix-root combinations or with root-root combinations, i.e. compounds. The 
elements in (2.8) do not occur as lexemes by themselves but must always be 
combined with other roots. In this respect they have a property that is typical 
of affixes, and scholars of Salishan languages have generally regarded them 
as such. However, if affixes are defined as ‘short morphemes with an abstract 
meaning’, then these elements are very atypical affixes, to say the least. 
English has a number of morphemes that are similarly difficult to classify 
as roots or affixes. Some examples are given in (2.10). 


(2.10) biogeography      aristocrat
       bioethics         autocrat
       bioengineering    democrat
       biorhythm         Eurocrat
       bioterrorism      plutocrat
       biomedicine       technocrat
       biochip           theocrat


The elements bio- and -crat could be regarded as affixes because they do not 
occur as independent lexemes, but their very concrete meaning and also 
their (not particularly short) form suggest that they should be regarded as 
bound stems that have the special property of occurring only in compounds. 


2.3 Morphemes and allomorphs 


While the distinction between roots on the one hand and affixes on the other 
is by itself quite useful, these concepts turn out to be more complicated 
than the simple picture that we have seen so far. One of the most common 
complications is that morphemes may have different phonological shapes 
under different circumstances. For instance, the plural morpheme in English is 
sometimes pronounced [s] (as in cats [kæts]), sometimes [z] (as in dogs [dɒgz]), 
and sometimes [-əz] (as in faces [feɪsəz]). When a single affix has more than 
one shape, linguists use the term allomorph. Affixes very often have different 
allomorphs — two further cases from other languages are given in (2.11). 


(2.11) a. Korean accusative suffix (marker of direct object): two allomorphs 

          -ul    ton ‘money’       ton-ul ‘money-ACC’
                 chayk ‘book’      chayk-ul ‘book-ACC’
          -lul   tali ‘leg’        tali-lul ‘leg-ACC’
                 sakwa ‘apple’     sakwa-lul ‘apple-ACC’

       b. Turkish first person possessive suffix: five allomorphs 

          -im    ev ‘house’        ev-im ‘my house’
                 dil ‘language’    dil-im ‘my language’
          -üm    köy ‘village’     köy-üm ‘my village’
                 gün ‘day’         gün-üm ‘my day’
          -um    yol ‘way’         yol-um ‘my way’
                 tuz ‘salt’        tuz-um ‘my salt’
          -ım⁴   ad ‘name’         ad-ım ‘my name’
                 kız ‘girl’        kız-ım ‘my daughter’
          -m     baba ‘father’     baba-m ‘my father’


4 The Turkish letter ı corresponds to IPA [ɯ] (high back unrounded vowel). 
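
The choice among the allomorphs in (2.11) can be sketched as two small Python functions. The conditions are assumptions read off the examples rather than statements from the text: Korean -lul follows a vowel-final noun and -ul a consonant-final one, and the vowel of the Turkish suffix matches the last vowel of the noun, with bare -m after a vowel-final noun. The function names are invented for this illustration, and the Korean vowel test covers only the romanized examples shown.

```python
# Vowel letters sufficient for the romanized Korean nouns in (2.11a).
# Romanized vowel digraphs ending in another letter are not handled.
KOREAN_VOWELS = set("aeiou")


def korean_accusative(noun: str) -> str:
    """Attach -lul after a vowel-final noun, -ul otherwise."""
    suffix = "lul" if noun[-1] in KOREAN_VOWELS else "ul"
    return f"{noun}-{suffix}"


# Last vowel of the noun -> vowel of the possessive suffix.
TURKISH_HARMONY = {
    "e": "i", "i": "i",
    "ö": "ü", "ü": "ü",
    "o": "u", "u": "u",
    "a": "ı", "ı": "ı",
}


def turkish_my(noun: str) -> str:
    """Attach the first person possessive suffix to a Turkish noun."""
    if noun[-1] in TURKISH_HARMONY:   # vowel-final noun: bare -m
        return f"{noun}-m"
    # Otherwise the suffix vowel is chosen by the noun's last vowel.
    last_vowel = next(ch for ch in reversed(noun) if ch in TURKISH_HARMONY)
    return f"{noun}-{TURKISH_HARMONY[last_vowel]}m"


print(korean_accusative("ton"), korean_accusative("sakwa"))  # ton-ul sakwa-lul
print(turkish_my("köy"), turkish_my("kız"), turkish_my("baba"))  # köy-üm kız-ım baba-m
```

Because each environment selects exactly one allomorph, the functions never have to choose between two candidates for the same noun.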
Not only affixes, but also roots and stems may have different allomorphs 
(or, as linguists often say, ‘exhibit allomorphy’). For instance, English verbs 
such as sleep, keep, deal, feel, mean, whose root has the long vowel [i:] in the 
present-tense forms, show a root allomorph with short [e] in the past-tense 
forms (slept, kept, dealt, felt, meant). Cases of stem allomorphy from other 
languages are given in (2.12). 

(2.12) a. German: a voiced obstruent becomes voiceless in syllable-final 
          position

          Tag [ta:k] ‘day’      Tage [ta:gə] ‘days’
          Hund [hunt] ‘dog’     Hunde [hundə] ‘dogs’
          Los [lo:s] ‘lot’      Lose [lo:zə] ‘lots’

       b. Russian: when the stem is followed by a vowel-initial suffix, the 
          vowel o/e is often dropped if it is the last vowel in the stem 

          zamok ‘castle’        zamk-i ‘castles’
          kamen' ‘stone’        kamn-i ‘stones’
          nemec ‘German’        nemc-y ‘Germans’
          nogot' ‘nail’         nogt-i ‘nails’

The crucial properties which define the German stems [ta:k] and [ta:g] or 
the Korean suffixes [-ul] and [-lul] as being allomorphs are that they have 
the same meaning and occur in different environments in complementary 
distribution. Additionally, all our examples so far have shown only fairly 
small differences in the shapes of morphemes, which can by and large be 
regarded as mere differences in pronunciation. Being phonologically similar 
is a common property of allomorphs, but is not a necessary one. Allomorphs 
that have this property are phonological allomorphs. The formal relation 
between two (or more) phonological allomorphs is called an alternation. 

Linguists often describe alternations with a special set of morphophonological 
rules, which were historically phonetically motivated, but affect morphology. 
Morphophonological rules and the difference between them will be discussed 
more extensively in Chapter 10, and we will consider them only briefly here. 
Metaphorically, it is often convenient to think about phonological 
allomorphy in terms of a single underlying representation that is 
manipulated by rules under certain conditions. The end result, i.e. what 
is actually pronounced, is the surface representation. For instance, the 
alternations in (2.12a, b) can be described by the underlying representations 
in the (a) examples below, and by the respective rules in the (b) examples. 
The surface representations (resulting word-forms) are given in (c). 

(2.13) a. underlying: [ta:g] ‘day.SG’
       b. rule: a voiced obstruent becomes voiceless in syllable-final 
          position (application: [ta:g] → [ta:k])⁵
       c. surface: [ta:k] ‘day.SG’


5 In this (morpho)phonological context, the arrow (‘X → Y’) means that X turns into Y. 
(2.14) a. underlying: [ta:g-ə] ‘day-PL’
       b. rule: a voiced obstruent becomes voiceless in syllable-final 
          position (doesn't apply) 
       c. surface: [ta:g-ə] ‘day-PL’


(2.15) a. underlying: [zamok] ‘castle.SG’
       b. rule: o/e in the final stem syllable disappears when the stem is 
          followed by a vowel-initial suffix (doesn't apply) 
       c. surface: [zamok] ‘castle.SG’


(2.16) a. underlying: [zamok-i] ‘castle-PL’
       b. rule: o/e in the final stem syllable disappears when the stem is 
          followed by a vowel-initial suffix (application: [zamok-i] → 
          [zamk-i]) 
       c. surface: [zamk-i] ‘castle-PL’


Notice that for (2.13) and (2.14), the underlying representation (morpheme) 
meaning 'day' is the same, and the rule applies only when its conditions are 
met. The same is true for (2.15) and (2.16). That the alternation is produced 
by the morphophonological rule is made particularly clear in this way: the 
underlying representation shows no allomorphy at all. 
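
The derivations in (2.13) to (2.16) can be mimicked with a small program. The following Python sketch is only an illustration: the function names, the simplified transcriptions and the decision to treat word-final position as a stand-in for syllable-final position are assumptions made for this example, not part of the analysis itself.

```python
# Sketch only: toy versions of the two rules, with simplified transcriptions.
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}
VOWELS = set("aeiouə")


def final_devoicing(underlying: str) -> str:
    """Devoice a voiced obstruent at the end of the word.

    Assumption: word-final position stands in for syllable-final position.
    """
    if underlying and underlying[-1] in DEVOICE:
        return underlying[:-1] + DEVOICE[underlying[-1]]
    return underlying


def stem_vowel_deletion(underlying: str) -> str:
    """Delete o/e in the final stem syllable before a vowel-initial suffix.

    The morpheme boundary is marked with a hyphen, as in the examples.
    """
    if "-" not in underlying:
        return underlying  # no suffix, so the condition is not met
    stem, suffix = underlying.split("-", 1)
    if suffix[:1] not in VOWELS:
        return underlying
    # Find the last vowel of the stem and delete it if it is o or e.
    for i in range(len(stem) - 1, -1, -1):
        if stem[i] in VOWELS:
            if stem[i] in ("o", "e"):
                stem = stem[:i] + stem[i + 1:]
            break
    return f"{stem}-{suffix}"


for form in ("ta:g", "ta:g-ə"):
    print(form, "->", final_devoicing(form))
for form in ("zamok", "zamok-i"):
    print(form, "->", stem_vowel_deletion(form))
```

In each pair the same underlying form is passed to the same rule; the surface forms differ only because the rule's condition is met in one case and not in the other.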

In many cases of phonological allomorphy, it is evident that the historical 
reason for the existence of the morphophonological rule and thus for 
the allomorphy is to facilitate pronunciation. For instance, if the English 
plural were uniformly [-z], words such as cats and faces would be almost 
unpronounceable (try to pronounce [kætz] and [feɪsz]!). Since this is a
textbook on morphology, we cannot go into greater phonological detail 
here, but phonological allomorphs will be taken up again in Chapter 10. 

Overall, the main point here is that at some level, phonological allomorphs 
represent a single morpheme whose form varies slightly depending upon 
the phonological context created by combining morphemes. For this reason, 
it is common to think of the morpheme as the more abstract underlying 
representation, rather than the more concrete surface word-form. The 
underlying and surface representations may be the same, or they may 
differ as a result of the application of morphophonological rules. However, 
it is important to remember that the underlying representation is a tool 
used by linguists. It may or may not reflect the kinds of generalizations 
that language users make. There are examples where it seems unlikely that 
there is a single underlying representation in the minds of speakers; we see 
this in another type of allomorphy: suppletion. 

Besides phonological allomorphs, morphemes may also have allomorphs 
that are not at all similar in pronunciation. These are called suppletive 
allomorphs. For instance, the English verb go has the suppletive stem wen 
in the past tense (wen-t), and the English adjective good has the suppletive 
stem bett in the comparative degree (better). The Russian noun čelovek 
‘human being’ has the suppletive stem ljud’ in the plural (ljud-i ‘people’). 
The Spanish verb ir ‘go’ has the suppletive stem va- in the present tense
(vas ‘you go’, va ‘s/he goes’, vamos ‘we go’, etc.). The term suppletion is 
most often used to refer to stem shape (ir and va- are both verbal stems), 
and some linguists reserve the term for this use, but others also talk about 
affixes as being potentially suppletive (see (2.17) later in this chapter for an 
example from Persian). 

It is not always easy to decide whether an alternation is phonological or 
suppletive, because the categories are end points on a continuum of traits, 
rather than a clear-cut binary distinction. Some examples are therefore 
intermediary. For instance, what about English buy/bought, catch/caught, 
teach/taught? The root allomorphs of these verbs ([baɪ]/[bɔ:], [kætʃ]/[kɔ:],
[ti:tʃ]/[tɔ:]) are not as radically different as go/wen-t, but they are not similar
enough to be described by phonological rules either. In such cases, linguists 
often speak of weak suppletion, as opposed to strong suppletion in cases 
like go/went, good/better. 
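
One rough way to picture this continuum is to score how much form two allomorphs share. The Python sketch below uses the standard library's `difflib.SequenceMatcher` on spellings. This is an assumption made purely for illustration: spelling is only a crude proxy for pronunciation, and the score is not a measure that linguists use to define weak or strong suppletion.

```python
from difflib import SequenceMatcher


def form_similarity(a: str, b: str) -> float:
    """Return a score between 0 (nothing shared) and 1 (identical spelling)."""
    return SequenceMatcher(None, a, b).ratio()


# Stem pairs taken from the discussion above (spellings, not transcriptions).
PAIRS = [
    ("zamok", "zamk"),
    ("buy", "bough"),
    ("catch", "caugh"),
    ("go", "wen"),
    ("good", "bett"),
]

# List the pairs from most similar to least similar.
for a, b in sorted(PAIRS, key=lambda pair: form_similarity(*pair), reverse=True):
    print(f"{a}/{b}: {form_similarity(a, b):.2f}")
```

On this crude measure the phonological alternation zamok/zamk scores highest, the weakly suppletive pairs fall in between, and go/wen- and good/bett- share nothing at all.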

For both weak and strong suppletion, it is theoretically possible to posit an
underlying representation from which suppletive allomorphs are derived 
by rule. However, considering that suppletive allomorphs share little or no 
form, the underlying representation would need to be very abstract, and the 
rules converting the underlying representation to surface representations 
could not exist to make pronunciation easier. There is no evidence that 
language users make such abstractions, so underlying representations are 
perhaps best treated as useful metaphors. 


Type of allomorphy     Description                    Example

Phonological           Alternation could be           English plural [-z], [-s], [-əz];
allomorphy             described by a rule of         Russian zamok/zamk-
                       pronunciation

Weak suppletive        Allomorphs exhibit             English buy/bough-, catch/
allomorphy             some similarity,               caugh-, etc.
                       but this cannot
                       be described by
                       phonological rules

Strong suppletive      Allomorphs exhibit no          English good/bett-
allomorphy             similarity at all


Table 2.2 Types of allomorphy: summary


When describing the allomorphy patterns of a language, another important 
dimension is the conditioning of the allomorphy, i.e. the conditions under 
which different allomorphs are selected. Phonological allomorphs typically 
have phonological conditioning. This means that the phonological context 
determines the choice of allomorph. For instance, the English plural
allomorphs [-z], [-s] and [-əz] are strictly phonologically conditioned: [-əz]
appears after a sibilant (i.e. [s], [z], [ʃ], [ʒ], [tʃ] or [dʒ], e.g. face-s, maze-s,
bush-es, garage-s, church-es, badge-s), [-s] appears after a voiceless non-sibilant
obstruent (e.g. cat-s, book-s, lip-s, cliff-s) and [-z] appears elsewhere (e.g. bag-s, 
bell-s, key-s). The Korean accusative allomorphs -ul/-lul (see (2.11a)) are also 
phonologically conditioned: -ul appears after a consonant, -lul after a vowel. 
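
Because the conditioning is purely phonological, the choice of English plural allomorph can be stated as a function of the final sound of the base alone. The Python sketch below is a minimal illustration. The function name, the sound inventories and the way each example word is paired with its final sound are assumptions made for this example, and the third allomorph is written here as -əz.

```python
# Sketch only: the final sound of the base is supplied by hand.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS_NON_SIBILANTS = {"p", "t", "k", "f", "θ"}


def english_plural(final_sound: str) -> str:
    """Choose the plural allomorph from the final sound of the base."""
    if final_sound in SIBILANTS:
        return "-əz"
    if final_sound in VOICELESS_NON_SIBILANTS:
        return "-s"
    return "-z"  # the 'elsewhere' case


# Example words from the text, paired with their final sounds.
EXAMPLES = {
    "face": "s", "maze": "z", "bush": "ʃ", "garage": "ʒ",
    "church": "tʃ", "badge": "dʒ",
    "cat": "t", "book": "k", "lip": "p", "cliff": "f",
    "bag": "g", "bell": "l", "key": "i:",
}

for word, sound in EXAMPLES.items():
    print(word, english_plural(sound))
```

Note that the function never looks at the meaning or the identity of the noun, only at its last sound; this is what distinguishes phonological conditioning from the other types discussed below.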

By contrast, stem suppletion usually has morphological conditioning, 
meaning that the morphological context (usually, grammatical function) 
determines the choice of allomorph (e.g. Spanish ir 'go' in the infinitive and 
future tense, va- in the present and imperfective past tense and fu- in the 
perfective past tense). 

And, finally, we find lexical conditioning, where the choice of a suppletive 
affix allomorph is dependent on other properties of the base, for instance 
semantic properties as in (2.17). 


(2.17) Persian plural marking: human nouns -an, non-human nouns -ha

       -an   mærd ‘man’            mærd-an ‘men’
             geda ‘beggar’         geday-an ‘beggars’

       -ha   gorbe ‘cat’           gorbe-ha ‘cats’
             ettefaq ‘incident’    ettefaq-ha ‘incidents’

       (Mahootian 1997: 190)


Lexical conditioning is also involved where the choice of allomorph cannot 
be derived from any general rule and must be learned individually for each 
word. This is the case for the English past participle suffix -en: speakers must 
simply learn which verbs take this suffix and not the more common suffix -ed. 
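
With lexical conditioning, the information that selects the allomorph has to be stored with the individual word. The Python sketch below illustrates both cases just discussed. The small lexicons are assumptions made for this example: the Persian entries repeat (2.17), while the English verbs take, eat and fall are our own illustrative choices of verbs that form their participle with -en.

```python
# Sketch only: tiny hand-made lexicons, not a description of either language.

# Persian: each noun is stored with the semantic property 'human'.
PERSIAN_NOUNS = {"mærd": True, "geda": True, "gorbe": False, "ettefaq": False}


def persian_plural_suffix(noun: str) -> str:
    """Select -an for human nouns and -ha for non-human nouns."""
    return "-an" if PERSIAN_NOUNS[noun] else "-ha"


# English: the verbs that take -en are simply listed one by one.
VERBS_WITH_EN = {"take", "eat", "fall"}


def english_participle_suffix(verb: str) -> str:
    """Return -en for listed verbs and the more common -ed otherwise."""
    return "-en" if verb in VERBS_WITH_EN else "-ed"


print(persian_plural_suffix("mærd"), persian_plural_suffix("gorbe"))
print(english_participle_suffix("take"), english_participle_suffix("play"))
```

In the Persian case the list could in principle be replaced by a general semantic statement, whereas the English set has to be memorized item by item.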


Type of conditioning    Description                  Example

Phonological            Choice of allomorphs         English plural depends
conditioning            depends on                   on final sound in stem
                        phonological context

Morphological           Choice of allomorphs         Spanish ir, va- or fu-,
conditioning            depends on the               depending on tense
                        morphological context

Lexical conditioning    Choice of allomorphs         English past participle
                        depends on the               -en/-ed is unpredictable
                        individual lexical           and depends on
                        item                         individual verbs


Table 2.3 Types of conditioning: summary


5 It is clear that phonological allomorphs can also have morphological conditioning. 
However, whether suppletive allomorphs can have phonological conditioning is subject 
to ongoing debate. 
Summary of Chapter 2


This chapter introduced several concepts that are basic to morphology. 
Three different notions of word have to be distinguished: the word token 
(‘use of a word in a text or in speech’), the lexeme (‘abstract, dictionary
word’) and the word-form (‘concrete word’). Inflectional morphology 
describes the relationship between the word-forms in a lexeme's 
paradigm, and derivational morphology describes the relation between 
lexemes. Complex words can often be segmented into morphemes, 
which are called affixes when they are short, have an abstract meaning, 
and cannot stand alone, and roots when they are longer and have a 
more concrete meaning. When two or more morphemes express the 
same meaning and occur in complementary distribution, they are often 
considered allomorphs. Allomorphs come in two types, phonological 
and suppletive, depending on the degree to which they are similar in 
form. Suppletive allomorphs are further subdivided into examples 
of strong suppletion and weak suppletion. The distinction between 
strong suppletion, weak suppletion and non-suppletion is a continuum. 
Allomorphs may have phonological, morphological, or lexical conditioning. 


Appendix. Morpheme-by-morpheme glosses 


When presenting longer examples (such as sentences or entire texts) from a 
language that the reader is unlikely to know, linguists usually add interlinear 
morpheme-by-morpheme glosses to help the reader understand the structure 
of the examples. We saw instances of such glosses in (1.2)-(1.6), and we will see 
more examples later in this book. Interlinear morpheme-by-morpheme glosses 
are an important aspect of 'applied morphology', and they are needed in other 
areas of linguistics as well (e.g. by syntacticians and fieldworkers). We will 
therefore explain the most important principles involved. 

The conventions used in this book are based on the Leipzig Glossing 
Rules (www.eva.mpg.de/lingua/resources/glossing-rules.php; accessed 
July 2010). The Leipzig Glossing Rules are more detailed than the principles 
presented here, but include the following: 

(i) One-to-one correspondence. Each element of the object language is 
translated by one element of the metalanguage (in the present context, this 
is English). Hyphens separate both the word-internal morphemes in the 
object language and the gloss, e.g. 

Japanese
Taroo ga  hana   o   migotoni    sak-ase-ta.
Taro  NOM flower ACC beautifully bloom-CAUS-PST
‘Taro made the flowers bloom beautifully.’
(Shibatani 1990: 309)
Object-language words and their glosses are aligned at their left edges. The
interlinear gloss is usually followed by an idiomatic translation in quotation 
marks. 
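
The one-to-one principle and the left-edge alignment are mechanical enough to be checked and produced by a short program. The Python sketch below is an illustration under stated assumptions: the function name `align_gloss` is hypothetical, words are assumed to be separated by spaces, and only two checks are made (one gloss per word, and the same number of hyphens in a word and its gloss).

```python
def align_gloss(object_line: str, gloss_line: str) -> str:
    """Check one-to-one correspondence and left-align words with glosses."""
    words = object_line.split()
    glosses = gloss_line.split()
    if len(words) != len(glosses):
        raise ValueError("each object-language word needs exactly one gloss")
    for word, gloss in zip(words, glosses):
        # Each hyphen in the example must correspond to a hyphen in the gloss.
        if word.count("-") != gloss.count("-"):
            raise ValueError(f"hyphen mismatch: {word!r} vs {gloss!r}")
    # Pad every column to the width of its longer member.
    widths = [max(len(w), len(g)) for w, g in zip(words, glosses)]
    top = " ".join(w.ljust(n) for w, n in zip(words, widths)).rstrip()
    bottom = " ".join(g.ljust(n) for g, n in zip(glosses, widths)).rstrip()
    return top + "\n" + bottom


print(align_gloss(
    "Taroo ga hana o migotoni sak-ase-ta.",
    "Taro NOM flower ACC beautifully bloom-CAUS-PST",
))
```

A gloss with a missing morpheme, such as bloom-PST for sak-ase-ta, is rejected by the hyphen check.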

(ii) Grammatical-value abbreviations. Grammatical elements (both 
function words and inflectional affixes) are not translated directly, but are 
rendered by grammatical-value labels, generally in abbreviated form (see 
the list of abbreviations on pp. xv-xvi). To highlight the difference between 
the value labels and the ordinary English words, the value labels are usually 
printed in small capitals, as seen in the above example. 

(iii) Hyphens and periods. Hyphens are used to separate word-internal 
morphemes in object-language examples, and each hyphen in an example 
corresponds to a hyphen in the gloss. Periods are used in the gloss when 
two gloss elements correspond to one element in the example. This may be 
when a single example element corresponds to a multi-word expression in 
the gloss, e.g. 


Turkish 
çık-mak
come.out-INF 
‘to come out’ 


or it may be when a single example element corresponds to several 
inflectional meanings: 


Latin 
insul-arum
island-GEN.PL 
‘of the islands’ 


or it may be when an inflectional meaning is expressed in a way that cannot 
be segmented, e.g. 


Albanian 

fik       fiq
fig.SG    fig.PL
‘fig’     ‘figs’


(The Albanian letter q corresponds here to IPA [c] (voiceless palatal stop), 
and k corresponds to [k] (voiceless velar stop).) 

The period is omitted when the two meanings are person and number, 
e.g. 

Tzutujil
x-in-wari
COMPL-1SG-sleep
‘I slept’
(Dayley 1985: 87)

Here ‘1SG’ is used instead of ‘1.SG’. (The period is felt to be redundant
because person and number combine so frequently.)

(iv) Possible simplifications. Sometimes the precise morpheme division 
is irrelevant or perhaps unknown. Authors may still want to give
information on the inflectional meanings, and again periods are used to
separate these elements, e.g.


Japanese            Latin
sakaseta            insularum
bloom.CAUS.PST      island.GEN.PL
‘made to bloom’     ‘of the islands’


Sometimes morpheme-by-morpheme glosses are used also when the 
example is not set off from the running text. In such cases the gloss is 
enclosed in square brackets, e.g. ‘the Japanese verb saka-se-ta [bloom-CAUS-PST]
“made to bloom” ...’.


Comprehension exercises 


1. Somali exhibits a great amount of allomorphy in the plural formation 
of its nouns. Four different allomorphs are represented in the following 
examples. Based on these examples, formulate a hypothesis about the 
phonological conditions for each of the plural allomorphs. (In actual 
fact, the conditions are more complex, but for this exercise, we have to 
limit ourselves to a subset of the data and generalizations.) 


SINGULAR     PLURAL

awowe        awowayaal      ‘grandfather’
baabaco      baabacooyin    ‘palm’
beed         beedad         ‘egg’
buug         buugag         ‘book’
cashar       casharro       ‘lesson’
fure         furayaal       ‘key’
ilmo         ilmooyin       ‘tear’
miis         miisas         ‘table’
qado         qadooyin       ‘lunch’
shabeel      shabeello      ‘leopard’
waraabe      waraabayaal    ‘hyena’
xidid        xididdo        ‘eagle’

Based on the generalizations found, form the plural of the following 
nouns: 

tuulo ‘village’ 


tog ‘river’ 




albaab ‘door’ 
buste ‘blanket’
(Berchem 1991: 98-117) 


2. The English past participle suffix spelled -ed has three different 
alternants: [d], [t], and [əd]. Are these phonologically or morphologically
conditioned? Try to describe the conditioning factors in an approximate 
way. 

3. Italian inhabitant nouns (e.g. Anconetano ‘person from Ancona’) exhibit 
different degrees of similarity to the corresponding city names. Order 
the following pairs of city names and inhabitant names on a scale from 
clear suppletion in the base form to clear non-suppletion, depending 
on the number of segments in which the base for the inhabitant noun 
differs from the base for the city name (see Crocco-Galéas 1991). Assume 
that word-final vowels are suffixes in Italian; the base for Ancona would 
thus be Ancon-. Additionally, inhabitant nouns contain the suffixes -an, 
-in, or -es, so the base for Anconetano is Anconet-. 


CITY NAME INHABITANT NOUN 
Ancona Anconetano 

Bologna Petroniano 

Bressanone Brissinese 

Domodossola Domese 

Gubbio Eugubino 

Ivrea Eporediese 

Milano Milanese "Milan" 
Napoli Partenopeo ‘Naples’ 
Palermo Palermitano 

Palestrina Prenestino 

Piacenza Piacentino 

Savona Savonese 

Trento Trentino 

Treviso Trevigiano 

Venezia Veneziano "Venice" 
Volterra Volaterrano 


Exploratory exercise 


This chapter introduced the idea that the set of word-forms belonging to 
the same lexeme is known as a paradigm. Readers may have noticed that 
a table-like format was used to list members of paradigms. The Modern 
Greek noun paradigm that we encountered in (2.2) is repeated below as 
(2.18). Here, the rows list cases and the columns list numbers. This format is 
sometimes called a grid. The grid format will be used elsewhere in the book, 
especially in Chapters 5 and 8, where inflectional morphology is discussed. 
(2.18) The paradigm of FILOS ‘friend’

                     singular    plural
       nominative    filos       fili
       accusative    filo        filus
       genitive      filu        filon


The grid format subtly implies that for a given lexeme there should be a 
word-form corresponding to each combination of case and number. The 
format makes sense because the expectation is usually fulfilled; in Greek, 
noun lexemes almost always have six word-forms corresponding to the 
six cells in the grid. There is thus some sense in which paradigms can be 
‘complete’ or ‘incomplete’. 
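
The idea of a grid with cells that may or may not be filled can be made concrete with a small data structure. In the Python sketch below, a paradigm is a mapping from (case, number) pairs to word-forms; the names `FILOS` and `missing_cells` are assumptions made for this example.

```python
from itertools import product

CASES = ("nominative", "accusative", "genitive")
NUMBERS = ("singular", "plural")

# The paradigm in (2.18), stored as one word-form per grid cell.
FILOS = {
    ("nominative", "singular"): "filos",
    ("nominative", "plural"): "fili",
    ("accusative", "singular"): "filo",
    ("accusative", "plural"): "filus",
    ("genitive", "singular"): "filu",
    ("genitive", "plural"): "filon",
}


def missing_cells(paradigm: dict) -> list:
    """List the case/number combinations that have no word-form."""
    return [cell for cell in product(CASES, NUMBERS) if cell not in paradigm]


print(missing_cells(FILOS))  # an empty list means the paradigm is complete
```

The grid is defined independently of any particular lexeme, which is exactly why it makes sense to ask whether a given lexeme fills every cell.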

In this exercise, you will explore whether the same notion applies to 
word families. Do word families usually have an equal number of members 
and parallel content? For instance, if the verb lexeme READ has READABLE, 
UNREADABLE, READER and REREAD in its word family, does every other verb 
lexeme X also have XABLE, UNXABLE, XER and REX? Or do word families 
often have some lexemes, but not others that seem equally possible? Finally, 
does the notion of ‘completeness’ apply to word families? Is it reasonable 
to talk about a word family as being incomplete? We will address some of 
these questions in Chapters 5 and 6, but in this exercise you will anticipate 
that discussion with some exploratory analysis. 

English is used here for demonstration purposes because it is familiar 
to all readers, but you are encouraged to investigate your native language, 
whatever that might be. 


Instructions 


Step 1: Create a list of at least 20 adjectival (or nominal, verbal...) lexemes, 
e.g. CLEAR, FALSE, HAPPY. For each one, list all of the lexemes belonging to 
its word family. Use a dictionary to prod your memory if needed, but do 
not rely on dictionary entries when they contradict your own judgements. 
For instance, the Oxford English Dictionary lists the following entries (among 
others) as being related to the adjective happy: happify, happiless, happily, 
happiness, happious, happy-go-lucky, happy-slappy, enhappy, mishappy, and 
trigger-happy. Some of these, like happiness, are quite normal, but others, 
like happify, happious, and enhappy, seem odd at best. For the authors, the 
word family of HAPPY does not contain these three lexemes. For each word 
family in your data set, decide its content for yourself, according to your 
own usage and judgements about whether a given lexeme is possible. 

Step 2: Compare the sets. The lexeme CLEAR is in the same set as CLARIFY, 
and FALSE is related in a parallel fashion to FALSIFY, but the word family for
HAPPY does not contain HAPPIFY (despite being in the dictionary). Do the 
word families in your data set mostly have parallel content, or mostly not? 

Step 3: Discuss the content of these word families in terms of the following 
questions: 
1) In your data, was it ever hard to decide whether two lexemes belong
to the same word family? If so, why? Discuss the issues surrounding any 
choices you had to make. 

2) What kinds of meanings are expressed by the derivationally related 
lexemes? It is not important at this stage to be precise about terminology 
- describe them as best you can. How do these compare to the inflectional 
meanings that you have seen in this chapter? Can derivationally related 
lexemes be organized into grids in the way inflectionally related word- 
forms are? Why or why not? 

3) Does it make sense to talk about word families as complete, or at least 
potentially complete? Are paradigms and word families similar or different 
in this respect? Explain your reasoning. (Both a ‘yes’ and a ‘no’ answer to 
the question is probably possible. The important part is that you explain 
and justify your answer.) 


Rules 


So far we have talked about morphological structure in mostly static
terms: words ‘have’ affixes or ‘share’ parts, they ‘exhibit’ resemblances
and they ‘consist of’ a base and an affix. However, it is often convenient
to describe complex words as if they were the result of a process or event.
Thus, we said that affixes ‘are attached’ to the base or that they ‘combine’
with it. Linguists use such process terms very frequently. They talk about
elements ‘being affixed’ to bases, or about a complex word ‘being derived
from’ (i.e. built on the basis of) a simpler one.¹

Most linguists agree that complex words need not be derived from 
simpler ones each time they are used. Instead, frequently used words 
are probably listed in the lexicon. The lexicon is the linguist's term for 
the mental dictionary that language users must be equipped with, in 
addition to the grammatical rules of their language. When a linguist says 
that something is listed in the lexicon, this means that it must be stored in 
speakers' memories. If a complex word has its own lexical entry (i.e. listing 
in the lexicon), it does not need to be actively derived from a simpler form; 
it can simply be retrieved from memory when needed. (The content of the 
lexicon will be discussed in Chapter 4.) 

Still, speakers have the capacity to create, and hearers can understand, 
an almost unlimited number of new words. The set of words in a language 
is never quite fixed. There must therefore be some processes by which new 
complex words are created. And even when a complex word is likely to be 
listed in the lexicon, it is useful to think of the relationship between it and 
its base in terms of these same processes. These processes, and how they 
can be formally described using morphological rules, are the topic of this 
chapter. 


¹ The term derive is somewhat confusing because it is also commonly applied to inflectional
morphology, not just to derivational morphology. Thus, one would say that the
comparative form warmer is derived from the positive form warm, or that the past-tense
form played is derived from the present-tense form play.
3.1 Morphological patterns


Morphological structure is much more various than simply affixes 
combining with bases. For example, in German, one way of forming the 
plural of a noun is by replacing a back vowel of the singular form (e.g. [ʊ],
[a:], [ɔ]) by a front vowel (e.g. [ʏ], [ɛ:], [œ], spelled ü, ä, ö). Some examples
are given in (3.1). 


(3.1) singular    plural

      Mutter      Mütter      ‘mother(s)’
      Vater       Väter       ‘father(s)’
      Tochter     Töchter     ‘daughter(s)’
      Garten      Gärten      ‘garden(s)’
      Nagel       Nägel       ‘nail(s)’


Here, we have a clear-cut example of morphological structure in that a 
recurrent aspect of meaning (‘plural’) corresponds to a recurrent aspect 
of form (vowel quality), but the plural word-forms cannot be segmented 
into two morphemes. Intuitively, it is easier to think of the stem vowel 
having been changed, rather than a morpheme having been added. We 
will use the term morphological pattern to cover both examples in which 
morphological meaning can be associated with a segmentable part of 
the word, and examples where this is not possible. A morpheme, then, 
is a frequently occurring, special subtype of morphological pattern. We 
begin by examining a range of morphological patterns, both common and 
uncommon, from various languages. 


3.1.1 Affixation and compounding


Linguists often distinguish two basic types of morphological patterns: 
concatenative, which is when two morphemes are ordered one after 
the other, and non-concatenative, which is everything else. Most of the 
examples of morphologically complex words that we have seen so far can 
be neatly segmented into roots and affixes, and are therefore concatenative 
patterns. In process terms, these can be described as derived by affixation 
(subtypes suffixation, prefixation, etc.) and compounding. 

Affixation involves more than just combining two morphemes. A rule 
of affixation is also a statement about which types of morphemes may 
combine. This is the combinatory potential of the affix. (Other terms that 
are widely used are subcategorization frame and selectional restriction.) 
For example, un- and intelligent may combine via affixation to form 
unintelligent, but it is not the case that any affix and any base can combine. 
The suffix -able attaches only to verbs; intelligentable is not a potential word 
of English because intelligent is an adjective, not a verb. And un- can attach 
to adjectives, but does not generally attach to nouns; ungrass is also not a
possible word of English (although the soft drink company 7UP played on 
this restriction to grab attention with the slogan '7UP: the uncola’). 

The combinatory potential of an affix cannot be entirely predicted from 
its meaning. For example, the prefix non- is virtually identical in meaning to 
un-, but it commonly attaches to nouns (e.g., non-achiever) and less readily 
to adjectives (non-circular, but *non-kind, ??non-intelligent). Combinatory 
potential must therefore be specified along with other information about 
the affixation process. As with un-, non- and -able, the word-class of the base 
(noun, verb, adjective, etc.) is an important factor for combinatory potential. 
Linguists thus sometimes say that affixes 'select' a particular word-class to 
attach to. 

The combinatory potential of the prefix un- can be expressed with the 
notation in (3.2a), where '—' stands for the affix and 'A' indicates both the 
word-class of the base and the position of the base relative to the affix. 


(3.2) a. Combinatory potential of un- [— A] 
b. Combinatory potential of -able [V—] 
c. Combinatory potential of comparative -er [A—] 
d. Combinatory potential of -ful [N—] 


Affixation is thus a process that has a number of important parameters 
(we will encounter more later in the book), but which can nonetheless be 
described in a fairly straightforward way. 
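
The parameters mentioned so far (the shape of the affix, its position, and the word-class it selects) can be written down as a small record. The Python sketch below is an illustration only: the class and function names are assumptions, the word-class labels in the toy lexicon are supplied by hand, the base hope is our own example for -ful, and spelling adjustments at the morpheme boundary are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Affix:
    form: str        # the segments of the affix, e.g. "un" or "able"
    selects: str     # word-class of the base: "A", "V" or "N"
    is_prefix: bool  # position of the affix relative to the base


# The four affixes of (3.2), with their combinatory potential.
AFFIXES = {
    "un-": Affix("un", "A", True),
    "-able": Affix("able", "V", False),
    "-er": Affix("er", "A", False),
    "-ful": Affix("ful", "N", False),
}

# Toy lexicon: word-classes assigned by hand for this example.
WORD_CLASS = {"intelligent": "A", "warm": "A", "read": "V",
              "grass": "N", "hope": "N"}


def attach(affix_name: str, base: str) -> Optional[str]:
    """Combine affix and base, or return None if the base is not selected."""
    affix = AFFIXES[affix_name]
    if WORD_CLASS.get(base) != affix.selects:
        return None
    return affix.form + base if affix.is_prefix else base + affix.form


print(attach("un-", "intelligent"))    # affix and base are compatible
print(attach("-able", "intelligent"))  # None: -able selects verbs
print(attach("un-", "grass"))          # None: un- selects adjectives
```

The sketch deliberately treats combinatory potential as a stored property of each affix, in line with the observation that it cannot be predicted from the affix's meaning.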


3.1.2 Base modification 


At the same time, a range of morphological patterns exists that cannot 
be straightforwardly segmented into two meaningful parts. As with the 
German example in (3.1), it is often easiest to describe non-concatenative 
patterns as results of processes or operations that apply to a base form. 
Some non-concatenative patterns exist in a wide variety of languages, 
including English, and will probably be familiar to the reader. Others may 
seem more ‘exotic’. Example (3.3) shows a non-concatenative pattern in 
Albanian. Here, a stem-final [k] in the singular becomes [c] in the plural, [g]
becomes [ɟ], and [ɫ] becomes [j]:


(3.3) SINGULAR             PLURAL
      armik [...k]         armiq [...c]       ‘enemy/enemies’
      fik [...k]           fiq [...c]         ‘fig(s)’
      frëng [...g]         frëngj [...ɟ]      ‘Frenchman/-men’
      murg [...g]          murgj [...ɟ]       ‘monk(s)’
      papagall [...ɫ]      papagaj [...j]     ‘parrot(s)’
      portokall [...ɫ]     portokaj [...j]    ‘orange(s)’


(Buchholz and Fiedler 1987: 264-5) 
Non-concatenative patterns force us to revise our original definition of
base (‘the part of a word that an affix is attached to’; Section 2.2). We want 
to say that armik is the base for armiq, but our earlier definition is adequate 
only in the context of concatenation. Thus, (3.4) is a better definition of base. 


(3.4) The base of a morphologically complex word is the element to 
which a morphological operation applies. 


This subsumes the earlier definition and allows us to also talk about non- 
concatenative patterns in a satisfactory way. 

One important class of non-concatenative patterns is base modification (or
stem modification/alternation). This is a collective term for morphological 
patterns in which the shape of the base is changed without adding 
segmentable material. A common type of base modification pattern results 
from changing place of articulation. For example, the Albanian example in 
(3.3) involves palatalization of the last consonant of the base (producing 
the sound at the palate), and the German example in (3.1) consisted of 
fronting of the stem vowel (changing the place of articulation so that 
the vowel is pronounced more towards the front of the mouth). We can 
also find examples of morphological patterns involving changed manner 
of articulation. In Scottish Gaelic, indefinite nouns undergo weakening 
of word-initial obstruent consonants in the genitive plural. Here, stop 
consonants become fricatives: [b] becomes [v], [k] becomes [ç], [g] becomes
[ɣ], and [tʰ] becomes [h]. (Note that for some pairs, a change in place of
articulation also occurs.)


(3.5) NOM SG INDF        GEN PL INDF
      [b...] bàrd        [v...] bhàrd       ‘bard’
      [k...] ceann       [ç...] cheann      ‘head’
      [g...] guth        [ɣ...] ghuth       ‘voice’
      [tʰ...] tuagh      [h...] thuagh      ‘axe’
      [b...] balach      [v...] bhalach     ‘boy’


(Calder 1923: 81-93) 


Standard Arabic, Quechua and Hindi/Urdu have morphological patterns 
involving length. In Standard Arabic, a causative verb is formed by 
gemination. This means that a consonant becomes lengthened, in this case 
the second consonant in the root (e.g. darasa ‘learn’ → darrasa ‘teach’, waqafa
‘stop (intr.)’ → waqqafa ‘stop (tr.)’, damara ‘perish’ → dammara ‘annihilate’).²
In Huallaga Quechua, the first person singular of verbs is formed in part by
lengthening the final stem vowel (in phonology, long vowels are usually
indicated by a colon).


² The arrow symbol (→) is used to express a relationship of derivation: ‘A → B’ means that
B is derived from A.
(3.6) 2ND SINGULAR                     1ST SINGULAR
      aywa-nki ‘you go’                aywa: ‘I go’
      aywa-pti-ki ‘when you went’      aywa-pti: ‘when I went’
      aywa-shka-nki ‘you have gone’    aywa-shka: ‘I have gone’


(Weber 1989: 99, 118) 


By contrast, in Hindi/Urdu, intransitive verbs are formed from transitive 
verbs by shortening the stem vowel (e.g. ma:r- ‘kill’ → mar- ‘die’, kho:l-
‘open (tr.)’ → khul- ‘open (intr.)’, phe:r- ‘turn (tr.)’ → phir- ‘turn (intr.)’).

Base modification also commonly takes the form of a tonal change or 
stress shift. For example, in Chalcatongo Mixtec, adjectives are formed 
from nouns by changing the tone pattern of the base to a high-high pattern 
(indicated by two acute accents): 


(3.7) NOUN               ADJECTIVE
      káʔba ‘filth’      káʔbá ‘dirty’
      žuù ‘rock’         žúú ‘solid, hard’
      xaʔà ‘foot’        xáʔá ‘standing’


(Macaulay 1996: 64) 


Somewhat similarly, English has verbs that differ from their corresponding 
nouns only by stress placement (e.g. díscount (noun) ↔ discóunt (verb),
ímport (noun) ↔ impórt (verb), ínsult (noun) ↔ insúlt (verb)). English also
has a few cases where a verb is derived from a noun by a different operation:
voicing the last consonant of the root (e.g. hou[s]e (noun) → hou[z]e (verb),
thie[f] (noun) → thie[v]e (verb), wrea[θ] (noun) → wrea[ð]e (verb)).

Finally, two interesting but less commonly attested morphological patterns
result from subtraction (the signalling of a morphological relationship by 
deleting one or more segments from the base), and metathesis (switching 
of two or more segments within the base). For example, one way of forming 
the plural in Murle is by subtracting the last consonant: 


(3.8) SINGULAR    PLURAL

      nyoon       nyoo        ‘lamb(s)’
      wawoc       wawo        ‘white heron(s)’
      onyiit      onyii       ‘rib(s)’
      rottin      rotti       ‘warrior(s)’


(Arensen 1982: 40-1) 


And Clallam marks a distinction between actual and non-actual events by 
metathesis of the first vowel with the preceding consonant. (Clallam verbs 
must have suffixes marking the agent, but only the stem is given here to 
keep the data simple.) 
(3.9) NON-ACTUAL    ACTUAL
      qq'í-         qíq'-       ‘restrain’
      pkʷ'ə́-        pə́kʷ'-      ‘smoke’
      t'cə́-         t'ə́c-       ‘shatter’
      kʷ'sə́-        kʷ'ə́s-      ‘count’


(Thompson and Thompson 1969: 216) 


The examples in this section show that while many base modification 
patterns may seem odd to the English speaker, there are few inherent 
restrictions on how morphological relationships can be signalled. In addition 
to adding segments, morphological patterns can be formed by deleting, 
rearranging, lengthening, shortening, weakening, palatalizing, etc. Also, 
non-concatenative morphological processes are similar to concatenative 
processes in having restrictions that are equivalent to combinatory potential. 
For example, the tonal change pattern shown in (3.7) applies only to nouns, 
and the resulting complex word is an adjective. This is comparable to -able 
applying only to verbs, with the resulting complex word being an adjective. 
We can therefore think of combinatory potential as a restriction that applies 
broadly to all morphological processes, and not only to affixation. 
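
Definition (3.4) treats a morphological pattern as an operation applied to a base, and this carries over directly to a program in which each operation is a function from one form to another. The Python sketch below covers three of the patterns from this section. The function names are assumptions made for this example, and each letter is treated as one segment, which is a simplification.

```python
# Sketch only: each letter counts as one segment.
VOWELS = set("aeiou")


def subtract_final_consonant(base: str) -> str:
    """Subtraction, as in the Murle plurals in (3.8)."""
    if base and base[-1] not in VOWELS:
        return base[:-1]
    return base


def geminate_second_consonant(base: str) -> str:
    """Gemination, as in the Standard Arabic causatives."""
    seen = 0
    for i, segment in enumerate(base):
        if segment not in VOWELS:
            seen += 1
            if seen == 2:
                # Insert a copy of the second consonant next to itself.
                return base[:i] + segment + base[i:]
    return base


def lengthen_final_vowel(base: str) -> str:
    """Vowel lengthening, as in the Huallaga Quechua forms in (3.6)."""
    if base and base[-1] in VOWELS:
        return base + ":"
    return base


print(subtract_final_consonant("nyoon"))
print(geminate_second_consonant("darasa"))
print(lengthen_final_vowel("aywa-pti"))
```

None of the three functions adds a segmentable affix; the output differs from the input only in the shape of the base itself.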


3.1.3 Reduplication 


A very common morphological operation is reduplication, whereby part 
of the base or the complete base is copied and attached to the base (either 
preceding or following it). In Malagasy, adjectives with stress on the first 
syllable copy the entire base. In the reduplicated form the meaning of the 
adjective is less intense. 


(3.10) Reduplication of entire stem: Malagasy 


be ‘big, numerous’ be-be ‘fairly big, numerous’ 
fotsy ‘white’ fotsi-fotsy ‘whitish’ 

maimbo ‘stinky’ maimbo-maimbo ‘somewhat stinky’ 
hafa ‘different’ hafa-hafa ‘somewhat different’ 


(Keenan and Polinsky 1998: 571) 


In Ponapean and Mangap-Mbula, only part of the base is copied. In 
Ponapean a consonant + vowel (CV) sequence is prefixed to the stem, 
whereas in Mangap-Mbula a vowel + consonant (VC) sequence is suffixed 
to the stem. 


(3.11) Reduplication of a CV sequence before the base: Ponapean 


duhp ‘dive’ du-duhp ‘be diving’ 
mihk ‘suck’ mi-mihk ‘be sucking’ 
wehk ‘confess’ we-wehk ‘be confessing’ 


(Rehg 1981: 78) 

(3.12) Reduplication of a VC sequence after the base: Mangap-Mbula 


kuk ‘bark’ kuk-uk ‘be barking’ 
kel ‘dig’ kel-el ‘be digging’ 
kan ‘eat’ kan-an ‘be eating’ 


(Bugenhagen 1995: 53) 


The element that is attached to the base often consists of both copied 
segments and fixed segments, so that a kind of mixture between affix and 
reduplicant results. Such elements may be called duplifixes. 


(3.13) Plurals in Somali: duplifix -aC 


buug ‘book’ buug-ag ‘books’ 
fool ‘face’ fool-al ‘faces’ 
koob ‘cup’ koob-ab ‘cups’ 
jid ‘street’ jid-ad ‘streets’ 


(Berchem 1991: 102) 


(3.14) ‘Sort of’ adjectives in Tzutujil: duplifix -Coj 


saq ‘white’ saq-soj ‘whitish’ 
rax ‘green’ rax-roj ‘greenish’ 
q'eq ‘black’ q'eq-q'oj ‘blackish’ 
tz'il ‘dirty’ tz'il-tz'oj ‘dirtyish’ 


(Dayley 1985: 213) 


Linguists often treat reduplication as affixation of a template and copying 
of the root as needed to fill out the segments of that template. For Ponapean, 
the template is CV-, where C and V stand for 'empty' slots that can be filled 
with a consonant and vowel, respectively. The prefixation of the template 
itself is easily understood as concatenation, and it is reasonable to think 
of the template as a kind of morpheme. However, it is less clear that the 
copying process is concatenative; it seems to have more in common with 
gemination or vowel lengthening. 
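
The template view of reduplication can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions: forms are treated as spelling strings, the vowels are taken to be the letters a, e, i, o, u, and the function names are invented here rather than taken from the literature. Each function fills the empty slots of a template by copying segments from the base.

```python
VOWELS = set("aeiou")  # assumption: vowel letters of the orthography


def first_vowel_index(base: str) -> int:
    """Return the position of the first vowel letter in the base."""
    for i, segment in enumerate(base):
        if segment in VOWELS:
            return i
    raise ValueError(f"no vowel in {base!r}")


def cv_prefix(base: str) -> str:
    """CV- template (Ponapean, 3.11): copy one consonant and the first vowel."""
    i = first_vowel_index(base)
    consonant = base[:i][-1:]  # the consonant right before the vowel, if any
    return consonant + base[i] + "-" + base


def vc_suffix(base: str) -> str:
    """-VC template (Mangap-Mbula, 3.12): copy the final vowel + consonant."""
    if len(base) < 2 or base[-2] not in VOWELS or base[-1] in VOWELS:
        raise ValueError(f"{base!r} does not end in a VC sequence")
    return base + "-" + base[-2:]


def duplifix_ac(base: str) -> str:
    """Duplifix -aC (Somali, 3.13): fixed a plus a copy of the final consonant."""
    return base + "-a" + base[-1]
```

Under these assumptions, cv_prefix("duhp") gives du-duhp, vc_suffix("kuk") gives kuk-uk, and duplifix_ac("buug") gives buug-ag. In the first two functions every segment of the added element is copied, whereas in the duplifix one slot is fixed in advance.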


3.1.4 Conversion 


Finally, the limiting case of a morphological pattern is conversion, in 
which the form of the base remains unaltered. A standard example is the 
relationship between some verbs and nouns in English: 


(3.15) NOUN      VERB 
       hammer    hammer 
       plant     plant 
       ship      ship 
       walk      walk 
       drink     drink 

Since we defined morphological patterns as partial resemblances in form 
and meaning among groups of words, conversion can be regarded as 
morphological in nature only if this definition is relaxed somewhat, because 
the resemblance in form is total here. Conversion is generally invoked only 
for derivational morphology, and primarily for relating two lexemes that 
differ only in lexical class. 


3.1.5 Outside the realm of morphology 


Sometimes a number of additional types are given under the heading of 
morphological operations, such as alphabet-based abbreviations (acronyms 
such as NATO, and alphabetisms such as CD (pronounced [siːˈdiː]), Ph.D. 
(pronounced [piːeɪtʃˈdiː])), clippings (e.g. fridge from refrigerator, pram from 
perambulator) and blends (e.g. smog from smoke and fog, infotainment from 
information and entertainment). These are clearly operations that can be 
used to create new words. However, they do not fall under morphology, 
because the resulting new words do not have different meanings to the 
longer words from which they are formed. Thus, not all processes of word- 
creation fall into the domain of morphological structure, and abbreviations 
and clippings will play no role in this book. 


3.2 Two approaches to morphological rules 


Taking this range of observed phenomena, we can now turn our attention 
to analysis and ask what these morphological patterns indicate about 
morphological structure. The ultimate goal is to create a system of 
morphological rules that mimics speakers’ linguistic knowledge, but this is 
not always a straightforward process. In addition to accurately representing 
morphological generalizations, rules should be elegant and cognitively 
realistic (see Section 1.3 for discussion of these goals). Moreover, the 
generalizations that we consider important to describe with morphological 
rules depend in part on the kinds of explanations that we posit. 

For instance, concatenative patterns are more common in the world’s 
languages than non-concatenative patterns. Why is this? And is it important 
that this generalization be directly reflected by the system of morphological 
rules that we formulate? One possibility is that languages favour 
concatenative morphological patterns due to some inherent property of the 
language system, perhaps because morphological structure is fundamentally 
similar to syntactic structure. If so, our system of morphological rules should 
be structured in such a way that it places restrictions on non-concatenative 
patterns, and maximizes similarity to syntactic structure. Alternatively, 
concatenative patterns might be more common for reasons external to the 
morphological system, for example, having to do with how languages 
change. This would suggest that morphological rules themselves need not 
be highly restrictive because the frequency of concatenative patterns does 
not result directly from the structure of the morphological system. The 
preponderance of concatenative patterns thus provides one example of how 
the type of explanation that we pursue, system-external explanation versus 
inherently restrictive architecture, affects our approach to morphological 
rules. There are many such issues. 

On the whole, the emphasis in this book is on questions of substance rather 
than questions of formal description. But in this section, two representative 
formalisms for morphological rules will be presented and contrasted, 
to help bring two major approaches into clear focus. One emphasizes 
commonalities between morphology and syntax and favours a restrictive 
architecture of description. The other tends to minimize the importance of 
parallels between syntax and morphology and invests in system-external 
explanations. As such, it sees a restrictive formal architecture as less 
important. We will call these the morpheme-based model and the word- 
based model, respectively. The morpheme-based model is associated with 
the morpheme-combination approach to morphology: ‘Morphology is the 
study of the combination of morphemes to yield words’ (Section 1.1). The 
word-based model represents a view of morphology consistent with the 
following definition: ‘Morphology is the study of systematic covariation 
in the form and meaning of words’ (also from Section 1.1). As we shall see, 
each approach has its own advantages and disadvantages. 


3.2.1 The morpheme-based model 


In the morpheme-based model, morphological rules are thought of as 
combining morphemes in much the same way that syntactic rules combine 
words. To see how this works, we can use the syntactic phrase structure rules 
in (3.16) to create a sentence by replacing elements on the left of the ‘=’ by 
elements on the right. In this notation, elements in parentheses are optional; 
curly brackets and commas represent a choice between alternative options. 


(3.16) Phrase-structure rules in syntax 


a. sentence = noun phrase + verb phrase 

b. noun phrase = { (i)  determiner (+ adjective) + noun, 
                   (ii) sentence } 

c. verb phrase = verb (+ noun phrase) 

d. determiner = the, a, some, ... 

e. noun = cat, rat, bat, ... 

f. verb = chased, thought, slept, ... 

g. adjective = big, grey, ... 


To produce the sentence A big cat chased the bat, we need the following 
individual steps, utilizing the rules in (3.16) (‘X → Y’ means ‘insert Y for X’). 

(3.17) sentence → noun phrase + verb phrase (by 3.16a) 
noun phrase → determiner + adjective + noun (by 3.16b) 
verb phrase → verb + noun phrase (by 3.16c) 
noun phrase → determiner + noun (by 3.16b) 
determiner + adjective + noun → a big cat (by 3.16d, g, e) 
verb → chased (by 3.16f) 
determiner + noun → the bat (by 3.16d, e) 


sentence: A big cat chased the bat. 
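
The rewriting procedure can be sketched in Python as follows. This is a minimal illustration, not a parser: the rules of (3.16) are stored as a dictionary, and the hypothetical function derive always rewrites the leftmost remaining category, so the order of steps differs slightly from the order shown in the derivation above. The name derive and the list DERIVATION are assumptions made for this example.

```python
# The rules of (3.16): each category maps to its possible right-hand sides.
RULES = {
    "sentence": [["noun phrase", "verb phrase"]],
    "noun phrase": [
        ["determiner", "noun"],
        ["determiner", "adjective", "noun"],
        ["sentence"],
    ],
    "verb phrase": [["verb"], ["verb", "noun phrase"]],
    "determiner": [["the"], ["a"], ["some"]],
    "noun": [["cat"], ["rat"], ["bat"]],
    "verb": [["chased"], ["thought"], ["slept"]],
    "adjective": [["big"], ["grey"]],
}


def derive(start, choices):
    """Rewrite the leftmost category once for each chosen right-hand side."""
    form = [start]
    for rhs in choices:
        # Find the leftmost element that is still a category (has a rule).
        idx = next((i for i, sym in enumerate(form) if sym in RULES), None)
        if idx is None:
            raise ValueError("nothing left to rewrite")
        if rhs not in RULES[form[idx]]:
            raise ValueError(f"{form[idx]!r} cannot be rewritten as {rhs!r}")
        form[idx:idx + 1] = rhs  # insert the right-hand side for the category
    return form


# One choice per rewriting step, in leftmost order.
DERIVATION = [
    ["noun phrase", "verb phrase"],
    ["determiner", "adjective", "noun"],
    ["a"], ["big"], ["cat"],
    ["verb", "noun phrase"],
    ["chased"],
    ["determiner", "noun"],
    ["the"], ["bat"],
]

print(" ".join(derive("sentence", DERIVATION)))
```

A choice that is not licensed by the rules, such as rewriting a noun as chased, is rejected with an error.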


Likewise, in order to describe the structure of English words like 
cheeseboard, bags, unhappier, eventfulness, one could make use of the word- 
structure rules in (3.18), which are analogous to the syntactic phrase 
structure rules above. 


(3.18) Word-structure rules 


a. word-form = stem (+ inflectional suffix) 

b. stem = { (i)  (deriv. prefix +) root (+ deriv. suffix), 
            (ii) stem + stem } 

c. inflectional suffix = -s, -er, ... 

d. derivational prefix = un-, ... 

e. root = bag, event, cheese, board, happy, ... 

f. derivational suffix = -ful, -ness, ... 


We can use these word-structure rules to create complex words. In the 
following, we see the individual steps by which the words bags, unhappier 
and cheeseboard can be created using the rules in (3.18). 


(3.19) word-form → stem + inflectional suffix (by 3.18a) 
stem → root → bag (by 3.18bi, 3.18e) 
inflectional suffix → -s (by 3.18c) 
word-form: bag-s 


(3.20) word-form → stem + inflectional suffix (by 3.18a) 
stem → derivational prefix + root (by 3.18bi) 
derivational prefix → un- (by 3.18d) 
root → happy (by 3.18e) 
inflectional suffix → -er (by 3.18c) 
stem: un-happy 
word-form: un-happi-er 


(3.21) word-form → stem (by 3.18a) 
stem → stem + stem (by 3.18bii) 
stem → root (by 3.18bi) 
root → cheese (by 3.18e) 
root → board (by 3.18e) 
stem: cheese-board 
word-form: cheese-board 
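
The word-structure rules in (3.18) can also be read as a well-formedness check on strings of morphemes. The Python sketch below is one possible rendering, offered as an assumption rather than as part of the model: each morpheme is mapped to a one-letter category code (P = derivational prefix, R = root, D = derivational suffix, I = inflectional suffix), and the rules (3.18a-b) are compressed into a single regular expression over those codes. Because rule (3.18bii) lets a stem consist of two stems, a stem amounts to one or more simple stems in a row.

```python
import re

# Rules (3.18c-f): the category of each listed morpheme.
LEXICON = {
    "-s": "I", "-er": "I",
    "un-": "P",
    "bag": "R", "event": "R", "cheese": "R", "board": "R", "happy": "R",
    "-ful": "D", "-ness": "D",
}

# Rules (3.18a-b): word-form = stem (+ I); stem = (P +) R (+ D) or stem + stem.
WORD_FORM = re.compile(r"(P?RD?)+I?")


def is_well_formed(morphemes):
    """Check whether a sequence of morphemes is licensed by the rules in (3.18)."""
    try:
        categories = "".join(LEXICON[m] for m in morphemes)
    except KeyError:
        return False  # a morpheme that is not listed in the rules
    return WORD_FORM.fullmatch(categories) is not None
```

On this sketch the sequences bag + -s, un- + happy + -er and cheese + board are accepted, while -s + bag is rejected because an inflectional suffix may only follow a stem.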

There are several ways to modify this general approach. For example, 
many linguists argue that we should dispense with word-structure rules in 
(3.18) and put all the relevant information, including combinatory potential, 
into lexical entries. This parallels an argument within syntactic theory. 
Many syntacticians have called into question the need for phrase-structure 
rules like (3.16a-c), on the grounds that the same information is already 
contained in words' lexical entries, making the general rules redundant. 

In line with this approach, an alternative formalism to (3.18) is illustrated 
in (3.22). These lexical entries contain information on the pronunciation, 
properties and meaning of the morpheme. The pronunciation is given 
between slashes in phonetic transcription, the properties consist of the 
word-class (for roots) or the combinatory potential (for affixes), and a 
rough indication of the meaning is given in quotation marks. (Naturally, a 
lot more needs to be said on the semantics of morphemes, but the details 
can be ignored for present purposes; see Section 11.1.1 for some aspects of 
the semantics that are relevant to morphology.) 


(3.22) proposed lexical entries for some morphemes: 


a. bag       b. -s         c. happy      d. un- 
   /bæg/        /z/           /hæpi/        /ʌn/ 
   N            N—            A             —A 
   ‘bag’        ‘plural’      ‘happy’       ‘not’ 


When lexical entries of roots and affixes are enriched in this way, 
morphological description seems to reduce largely to the description 
of the lexical entries of morphemes. Concatenation becomes a property of 
the lexical entry itself, all but removing the distinction between rules and 
morphemes. 
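
A rough Python sketch may help to show how concatenation can be driven entirely by lexical entries. The class Entry and the function attach are hypothetical names introduced for this illustration, and the word-class assigned to the output of each affix (N for the plural, A for un-) is an added assumption that the entries themselves do not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    phon: str                          # pronunciation
    cat: str                           # word-class (for affixes: class of the output)
    meaning: str                       # rough gloss
    attaches_to: Optional[str] = None  # combinatory potential (affixes only)
    is_prefix: bool = False


BAG = Entry("bæg", "N", "bag")
HAPPY = Entry("hæpi", "A", "happy")
PLURAL = Entry("z", "N", "plural", attaches_to="N")           # N—
UN = Entry("ʌn", "A", "not", attaches_to="A", is_prefix=True)  # —A


def attach(affix: Entry, base: Entry) -> Entry:
    """Concatenate affix and base if the affix's combinatory potential allows it."""
    if affix.attaches_to is None:
        raise ValueError("only affixes carry combinatory potential")
    if base.cat != affix.attaches_to:
        raise ValueError(f"affix requires a base of class {affix.attaches_to}")
    phon = affix.phon + base.phon if affix.is_prefix else base.phon + affix.phon
    return Entry(phon, affix.cat, f"{affix.meaning}({base.meaning})")
```

Here no separate word-structure rule is consulted: attach(PLURAL, BAG) succeeds because the entry for the plural suffix asks for a noun, and attach(UN, BAG) fails for the same kind of reason.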

Despite this difference, the core principle is the same in both (3.18) and 
(3.22): morphology consists of one basic type of lexical entry (morphemes) 
and one type of process that operates on those entries (concatenation). In 
this way, the morpheme-based model maximizes the formal similarity 
between morphology and syntax. 

The morpheme-based model raises two questions. First, what are the 
advantages and disadvantages, both empirical and theoretical, of reducing 
morphological structure to morpheme concatenation? Second, can this 
model account for base modification, reduplication and conversion, and if 
so, how? 

There are at least two good reasons to favour a theory of purely 
concatenative rules. First, as noted above, morpheme concatenation is 
the most common kind of morphological pattern cross-linguistically. By 
treating concatenation as the fundamental (or only) type of morphological 
rule, the morpheme-based model provides a natural explanation for this 
fact. Second, so far we have compared morphology to syntax by noting 
that the morpheme-based model treats morphological structure as a string 
of morphemes in much the same way that syntax consists of a string of 
words. However, morphology and syntax are similar in other respects as 
well. Much of the data is too complicated to consider here, but we can focus 
on one basic principle: hierarchical structure. 

Consider the sentence We need more intelligent leaders. It consists of a 
string of words, but it also has internal structure that is not necessarily 
reflected in the linear order of words. We know this is true because this 
sentence can have two possible interpretations, either ‘We need more 
leaders who are intelligent’, or ‘We need leaders who are more intelligent’. 
Each interpretation corresponds to a different hierarchical structure. In the 
first interpretation, intelligent leaders forms a subgroup (called a syntactic 
constituent), which is modified by more. In the second interpretation, 
more intelligent forms a constituent, and collectively modifies leaders. This 
hierarchical structure is represented using the (simplified) formalism of 
syntactic tree diagrams in Figure 3.1. 


Interpretation 1: we need [more [intelligent leaders]] 
Interpretation 2: we need [[more intelligent] leaders] 


Figure 3.1 Hierarchical structure in syntax 


Not all sentences have more than one possible interpretation, but all have 
hierarchical structure. 

Important here is that concatenative morphology also (arguably) has 
hierarchical structure, and words like undoable also have two possible 
interpretations: ‘unable to be done’ (do and -able form a constituent), or ‘able 
to be undone’ (un- and do are a constituent). These are represented in Figure 
3.2 using simplified morphological tree diagrams. 
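
The point that one string of morphemes can correspond to two structures can be sketched with nested tuples in Python. The representation and the helper name leaves are assumptions made for this illustration only.

```python
def leaves(tree):
    """Return the morphemes of a tree in linear order, ignoring the grouping."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


# 'unable to be done': do and -able form a constituent.
UNABLE_TO_BE_DONE = ("un-", ("do", "-able"))

# 'able to be undone': un- and do form a constituent.
ABLE_TO_BE_UNDONE = (("un-", "do"), "-able")
```

Both trees have the same leaves in the same order, so the linear string alone does not distinguish the two interpretations; only the grouping does.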

Interpretation 1: [un- [do -able]] 
Interpretation 2: [[un- do] -able] 


Figure 3.2 Hierarchical structure in morphology 


Inasmuch as morphology and syntax exhibit fundamental similarities 
of this type, we can hypothesize that morphology and syntax operate 
according to shared principles due to some innate property of language 
architecture, and that this is perhaps the reason that concatenative patterns 
are more commonly found. If so, it is desirable to posit a model that 
maximizes the formal similarity between morphology and syntax. For these 
and other reasons, the morpheme-based, concatenation-only approach 
to morphological analysis has been popular. (See Chapter 7 for a more 
thorough and critical discussion of hierarchical structure in morphology 
and its relation to syntactic structure.) 

Still, there are some disadvantages to positing that concatenation is the 
only rule type. Most notably, base modification and conversion are difficult 
to accommodate. Consider again Albanian plural nouns, e.g. armik ‘enemy’, 
armiq ‘enemies’. In principle, a morpheme-based model can account for this 
pattern by positing lexical entries as in (3.23): 


(3.23) a. armik        b. ‘plural’ 
          /armik/         /Ø/ 
          N               N— 
          ‘enemy’         ‘plural’ 


The entry in (3.23b) means that there is a suffix that contributes the 
meaning ‘plural’, but which has no phonological form. This is a zero affix 
(or zero expression). To get the right phonological form, the morpheme- 
based model must assume that the zero affix has some property that 
triggers palatalization on the final consonant of the root. This amounts 
to treating palatalization as morphologically conditioned allomorphy: the 
allomorph armiq is selected by the zero plural affix. This analysis allows 
us to avoid violating the fundamental principles of the morpheme-based 
model. Formally, it is the suffix in (3.23b), not the process of palatalization, 
that expresses plurality. 

While this is a possible analysis, it is not a very satisfactory one. Why 
should the zero plural affix trigger palatalization? There is no motivation 
for this. Moreover, in a language with several different base modification 
patterns, there would be several different zero affixes in the lexicon, each 
of which would trigger allomorphy differently. In short, zero morphemes 
are ad hoc devices that are posited for no purpose other than to save the 
principle of a concatenation-only model. Inasmuch as base modification and 
conversion represent common morphological patterns, this is a significant 
flaw of the morpheme-based model. 


3.2.2 The word-based model 


In the word-based model, the fundamental significance of the word is 
emphasized and the relationship between complex words is captured 
not by splitting them up into parts and positing a rule of concatenation, 
but by formulating word-schemas that represent the features common to 
morphologically related words. 

For instance, the similarities among the English words bags, keys, gods, 
ribs, bones, gems (and of course many others) can be expressed in the word- 
schema in (3.24c). 


(3.24) a. Words: bags, keys, gods, ribs, bones, gems, ... 


b. Lexical entries for words 

/bægz/N    /kʰijz/N    /gadz/N    /ribz/N 
‘bags’     ‘keys’      ‘gods’     ‘ribs’ 


c. Word-schema 

/Xz/N 
‘plurality of xs’ 


A word-schema is like the lexical entries in (3.24b) in that it contains 
information on pronunciation, syntactic properties and meaning. But a 
word-schema may additionally contain variables. In this way, it abstracts 
away from the differences between the related words and just expresses 
the common features. The schema in (3.24c) expresses the fact that all 
words in (3.24a,b) end in /z/, that they all denote a plurality of things and 
that they are all nouns (indicated by subscript N after the phonological 
representation). The phonological string preceding the /z/ is quite diverse 
and is thus replaced by the variable /X/. Likewise, semantically these 
words share nothing besides the plurality component, so again the semantic 
part of the schema contains a variable (‘x’). We will use the terms match and 
subsume for the relation between concrete words and the abstract schema: 
words match a schema, and a schema subsumes words (for example, the 
schema in (3.24c) subsumes the nouns in (3.24a, b) and many others, but not 
all English plural nouns match it; for instance, the plural feet does not match 
its phonological part). 
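
The relation between words and a schema can be sketched in Python. In this minimal illustration, which models only pronunciation and word-class and leaves meaning aside, the variable X is assumed to stand for any non-empty string of segments; the class names and the transcription /fit/ for feet are assumptions of the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    phon: str  # phonological representation
    cat: str   # word-class


@dataclass(frozen=True)
class Schema:
    phon: str  # phonological representation containing the variable X
    cat: str

    def subsumes(self, word: Word) -> bool:
        """True if the word matches this schema in pronunciation and word-class."""
        # Turn the schema into a pattern in which X matches one or more segments.
        pattern = re.escape(self.phon).replace("X", "(.+)")
        return word.cat == self.cat and re.fullmatch(pattern, word.phon) is not None


PLURAL_SCHEMA = Schema("Xz", "N")  # the schema in (3.24c)
```

PLURAL_SCHEMA.subsumes(Word("bægz", "N")) holds, whereas a word such as feet, transcribed here as /fit/, does not match the phonological part of the schema.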

Crucially, a word-schema stands for complete words, not for individual 
morphemes in the sense of the morpheme-based model. The word-schema 
in (3.24c) is a generalization based on the lexical entries in (3.24b), which are 
themselves word-forms, not morphemes. When the word-schema contains 
both a segmentable piece of sound and a corresponding meaning, as is 
true in (3.24c), we can call this a ‘morpheme’. However, it is important to 
remember that in a word-based model a morpheme is just a convenient 
term. It is simply one kind of schema among many, and has no special 
status. 

Now what makes word-schemas really significant for morphology is the 
fact that closely related schemas are connected to each other. A schema that 
subsumes a very similar set of words is given in (3.25c). 


(3.25) a. Words: bag, key, god, rib, bone, gem, ... 


b. Lexical entries 

/bæg/N    /kʰij/N    /gad/N    /rib/N 
‘bag’     ‘key’      ‘god’     ‘rib’ 


c. Word-schema 

/X/N 
‘x’ 


The morphological relationship between these sets of words can now be 
represented in the morphological correspondence in (3.26). 


(3.26) /X/N    ↔    /Xz/N 
       ‘x’          ‘plurality of xs’ 


The double arrow means that, for some word matching the schema on 
the left, there is a corresponding word matching the schema on the right. 
Example (3.26) thus shows what a morphological rule looks like in the word- 
based model. The rule in (3.26) is the word-based equivalent of (3.22b). It 
says that plural nouns can be formed from singular nouns by suffixing /z/. 

Note that combinatory potential is represented here as well. In the 
correspondence, the left schema is marked as a noun. This indicates that 
the rule forming plurals with [z] requires a base that is a noun. In this way 
too, the word-based formalism of (3.26) is equivalent to the morpheme- 
based formalism in (3.22b), in which the lexical entry for the morpheme 
/-z/ contains the restriction that the suffix must attach to a noun. 

Unlike the morpheme-based model, the word-based model has no way 
of dispensing with morphological rules, and while the correspondence in 
(3.26) represents a rule of suffixation, there is nothing in the model that 
necessarily restricts morphological rules to concatenation. As with the 
morpheme-based model, this raises the question: What are the advantages 
and disadvantages of an approach that allows a wide variety of rules, and 
gives no special status to rules of concatenation? 

There are at least three significant advantages of the word-based 
model. First, and most strikingly, non-concatenative patterns can be 
described with it quite naturally, whereas such phenomena are difficult to 
accommodate in morpheme-based models. As an example, (3.27b) shows 
the correspondence for English noun-verb conversion of nouns denoting 
instruments. 


(3.27) a. hammerN / hammerV, sawN / sawV, spoonN / spoonV, funnelN / funnelV, ... 


b. /X/N                     ↔    /X/V 
   ‘x (= an instrument)’         ‘use x (= an instrument)’ 


Here the word-schema on the right differs from the schema on the left in 
word-class and meaning, but not in phonological form. Processes of base 
modification can also be easily described by elaborating the phonological 
variable somewhat. For instance, shortening in Hindi/Urdu can be 
represented as in (3.28b), where /V:/ stands for any long vowel. 


(3.28) a. ma:r- ‘kill’, mar- ‘die’ 


b. /XV:Y/V                    ↔    /XVY/V 
   ‘A causes B to happen’          ‘B happens’ 


Reduplication is described by copying part of the phonological string in 
one of the word-schemas. (3.29b) shows the rule for the Somali duplifix -aC 
that we saw in (3.13) (here /C/ stands for any arbitrary consonant). 


(3.29) a. buug/buugag ‘book(s)’, fool/foolal ‘face(s)’, koob/koobab ‘cup(s)’, ... 


b. /XC1/N    ↔    /XC1aC1/N 
   ‘x’            ‘plurality of xs’ 
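
The three kinds of rule just illustrated can be sketched as small Python functions over strings. This is an informal rendering under stated assumptions: forms are written without the stem-final hyphen, vowels are the letters a, e, i, o, u, a long vowel is a vowel followed by a colon, and all function names are invented for the example. The backreference \2 in somali_singular plays the role of the repeated C1 in (3.29b).

```python
import re
from typing import Optional, Tuple


def convert(word: Tuple[str, str]) -> Optional[Tuple[str, str]]:
    """(3.27b), left to right: same form, word-class changes from N to V."""
    phon, cat = word
    return (phon, "V") if cat == "N" else None


def shorten(stem: str) -> Optional[str]:
    """(3.28b), left to right: /XV:Y/ corresponds to /XVY/ (first long vowel)."""
    result, count = re.subn(r"([aeiou]):", r"\1", stem, count=1)
    return result if count else None


def somali_plural(noun: str) -> Optional[str]:
    """(3.29b), left to right: /XC1/ corresponds to /XC1aC1/."""
    match = re.fullmatch(r"(.*)([^aeiou])", noun)
    return noun + "a" + match.group(2) if match else None


def somali_singular(plural: str) -> Optional[str]:
    """(3.29b), right to left: strip -aC1 when C1 repeats the preceding consonant."""
    match = re.fullmatch(r"(.*)([^aeiou])a\2", plural)
    return match.group(1) + match.group(2) if match else None
```

None of these functions concatenates a pre-existing morpheme: the first changes nothing in the form, the second removes length, and the third builds part of its output by copying.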


A second advantage is that the word-based model can explain how back- 
formations (like to babysit, which is historically derived from babysitter) are 
possible. In the morpheme-based model, it is quite puzzling that speakers 
should be able to create a verb babysit. English has many compounds 
denoting agents in which the first lexeme in the compound is interpreted 
as the object of an implicit action (babysitter, truck driver, window washer, 
etc.). English does not, however, have a productive rule combining a noun 
and a verb to create compounds like to babysit (*to bookread, *to fishcatch, *to 
truckdrive, *to windowwash). This causes a problem for the morpheme-based 
model because compounds can only be built up from component lexemes 
in this approach, and not derived directly from other compounds. But in the 
word-based model, the fact that babysit came from babysitter can be readily 
described. The noun babysitter matches two word-schemas simultaneously. 
First, it matches the nominal compound schema in (3.30), and everyone 
agrees that it was first created using this rule (i.e. as baby + sitter). 


(3.30) /X/N  &  /Y/N    ↔    /XY/N 
       ‘x’      ‘y’          ‘a y that has to do with x’ 


(Note that for compounds, the left-hand side of the correspondence must 
consist of two word-schemas.) And, second, it matches the word-schema 
of non-compound agent nouns given on the right in (3.31). (Here ‘doₓ’ 
represents a variable action meaning.) 


(3.31) /X/V     ↔    /Xər/N 
       ‘doₓ’         ‘a person who (habitually) doesₓ’ 


Crucially, the bidirectional arrow indicates that the correspondence in (3.31) 
is not directed. In addition to the creation of -er nouns from verbs (like bake 
→ bak-er, write → writ-er, sin → sinn-er, etc.), this rule allows the creation of 
verbs from nouns containing -er that denote an agent of some sort (babysitter 
→ babysit). This is a fundamental difference between the word-based model 
and the morpheme-based model. Clearly, the rule in (3.31) is much more 
productive from left to right than from right to left (e.g. one cannot form 
a verb to butch from butcher), but under what sort of circumstances a rule 
is productive or unproductive is a separate question that we return to in 
Chapter 6. 
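
The undirected character of the correspondence can be sketched in Python. In the following illustration a rule is simply a pair of schemas, and the hypothetical function apply_rule can read the pair in either direction. The rough transcriptions (/bejk/ for bake, /bejbisɪtər/ for babysitter) and all names are assumptions of the example, and meaning is not modelled.

```python
from typing import Optional, Tuple

Schema = Tuple[str, str]  # (phonological form with variable X, word-class)

# The correspondence in (3.31): verb on the left, agent noun on the right.
AGENT_RULE: Tuple[Schema, Schema] = (("X", "V"), ("Xər", "N"))


def apply_rule(rule, phon: str, cat: str, reverse: bool = False) -> Optional[Schema]:
    """Map a word from one side of the correspondence to the other."""
    source, target = (rule[1], rule[0]) if reverse else rule
    (src_phon, src_cat), (tgt_phon, tgt_cat) = source, target
    if cat != src_cat:
        return None
    # Split the source schema around X and check the fixed material.
    before, after = src_phon.split("X")
    if len(phon) <= len(before) + len(after):
        return None  # X must stand for at least one segment
    if not (phon.startswith(before) and phon.endswith(after)):
        return None
    x = phon[len(before):len(phon) - len(after)]
    return tgt_phon.replace("X", x), tgt_cat
```

Read from left to right, the rule takes /bejk/ to /bejkər/; read from right to left, it takes /bejbisɪtər/ to the verb /bejbisɪt/. The sketch would equally derive a verb from butcher, since it says nothing about productivity.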

Once a back-formed word has become a normal word of the language, 
it is synchronically indistinguishable from a non-derived word (thus, only 
historical linguists, but not other speakers of English, know that edit was 
back-formed from editor). From the perspective of the morpheme-based 
model, we might therefore argue that back-formations are solely a problem 
for language change, and that our theory can treat babysitter and editor as 
synchronically derived from to babysit and to edit, rather than the reverse. 
While this is true, it does not solve the problem because it still fails to 
explain how back-formation could ever arise in the first place. The word- 
based model has the advantage of being able to naturally explain this kind 
of historical development. 

Third, even some concatenative patterns cause problems for morpheme- 
based models but are easily described within a word-based model. Cross- 
formations are good examples of this. Consider the three sets of English 
words in (3.32). 


(3.32) attract     attraction     attractive 
       suggest     suggestion     suggestive 
       prohibit    prohibition    prohibitive 
       elude       —              elusive 
       insert      insertion      — 
       discuss     discussion     — 
       —           illusion       illusive 
       —           aggression     aggressive 


In order to describe the relations between these three sets, we minimally 
need the two correspondences in (3.33a-b). (For the sake of simplicity, we 
use the spelling rather than the pronunciation in representing the affixes 
-ion and -ive here.) 

(3.33) a. /X/V      ↔    /Xion/N 
          ‘doₓ’          ‘action of doingₓ’ 

       b. /X/V      ↔    /Xive/A 
          ‘doₓ’          ‘prone to doingₓ’ 


But these two rules do not suffice, because there are pairs like illusion/illusive, 
aggression/aggressive that lack a corresponding verb (*aggress, *illude). This 
means that within the word-based model we also need the rule (3.33c). 


(3.33) c. /Xion/N                 ↔    /Xive/A 
          ‘action of doingₓ’           ‘prone to doingₓ’ 


Rules of the type in (3.33c) are cross-formations: a morphological rule 
in which both word-schemas in the correspondence exhibit a constant 
phonological element (Becker 1993a). 

Cross-formations are in no way unusual or uncommon. Consider the pairs 
of words in (3.34), which demonstrate cross-formation in compounding. 


(3.34) seasick airsick 
sealane airlane 
seafare airfare 
seaborne airborne 
seamanship airmanship 
seaworthy airworthy 
seaman airman 


(Becker 1993a: 13-14) 


For the first few words one can still imagine that the sea and air compounds 
were created independently of each other — i.e. airsick from air + sick, without 
direct relation to the older word seasick, or airlane from air + lane, without 
direct relation to sealane. But for some of the others this seems very unlikely 
because the meaning is noncompositional: the meaning of the word-form 
is more than the sum of the meanings of the parts. A seaman is a low-ranking 
navy member, not any man with some relation to the sea, and similarly an 
airman is a low-ranking air force member. Thus, we are probably dealing 
with a rule as in (3.35). 


(3.35) /seaX/                    ↔    /airX/ 
       ‘an x having to do             ‘an x having to do 
       with sea travel’               with air travel’ 


We will see more examples of cross-formation when we discuss inflectional 
morphology in Chapter 8. 

Despite being concatenative, cross-formation patterns cannot be described 
so easily in a purely morpheme-based model. Since the morpheme-based 
model usually assumes that complex words are not stored in the lexicon, 
it must posit that illusion and illusive are each derived from a root illude, 
with the suffixes having the lexical entries [/-ion/; N; V— ] and [/-ive/; 
A; V— ]. The problem lies in the fact that once illude is posited as the root, 
the morpheme-based model then faces difficulty explaining why the verb 
illude does not exist. By allowing for a direct relationship between illusion 
and illusive, the word-based model avoids this problem. So in summary, the 
word-based model, by virtue of allowing a wide variety of rules, provides 
more satisfactory analyses of non-concatenative patterns, some results of 
language change such as back-formation, and even some concatenative 
patterns such as cross-formations. 

Of course, the word-based model has its own potential disadvantages. 
One of the most common criticisms of the word-based model is that it is 
not restrictive. Restrictiveness is an important feature of a morphological 
model because it allows us to make generalizations about what is possible 
and impossible in language. The word-based model allows morphological 
rules of virtually any type - including many that are not known to exist in 
any language. 

For example, in the well-known language game Pig Latin the basic 
principle is that if a word begins with one or more consonants, its Pig Latin 
equivalent moves those consonants to the end of the word and adds 'ay' 
([ej]). So, English book → Pig Latin ookbay, star → arstay, etc. And in one 
version of the rules, if the word begins with a vowel, the Pig Latin word 
adds ‘say’: orange → orangesay, apple → applesay, etc. 
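
The version of the game described here can be stated as a short Python function. The sketch works on lower-case spelling rather than on sounds and treats the letters a, e, i, o, u as the vowels; both choices, like the function name, are assumptions made for the illustration.

```python
import re


def pig_latin(word: str) -> str:
    """Move any initial consonants to the end and add 'ay'; otherwise add 'say'."""
    # Split the word into its initial consonant letters and the remainder.
    onset, rest = re.fullmatch(r"([^aeiou]*)(.*)", word).groups()
    if onset:
        return rest + onset + "ay"
    return word + "say"
```

The function gives ookbay for book, arstay for star, and orangesay for orange. Note that the initial consonants are moved across the whole of the rest of the word, however long it is.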

We must remember that Pig Latin is a language game, and thus different 
in many respects from normal language. Morphological patterns in which 
sounds are moved from the beginning to the end of the word do not seem 
to exist in natural languages. The morphological pattern of metathesis does 
involve switching the order of sounds (see (3.9)), but it always involves 
‘local’ rearranging — the sounds must be next to each other. No known 
natural languages have the Pig-Latin type of 'long distance' rearranging. 

Many linguists argue that a good model of morphology will preclude this 
kind of long-distance rearranging of sounds. The morpheme-based model 
automatically excludes the Pig Latin pattern (local rearranging is also 
excluded), but nothing about the word-based model necessarily does. If 
re-arranging of sounds is generally allowed in the word-based model (e.g. 
in order to account for metathesis), some extra theoretical device would be 
needed to restrict long-distance rearranging (e.g. an ad hoc principle that 
the rearranging must be local). Inasmuch as this makes the word-based 
model more complicated than the morpheme-based model, many linguists 
would argue that non-restrictiveness is a significant disadvantage of the 
word-based model. 

A related concern is how frequently morphological patterns involve 
concatenation. As discussed above, one of the advantages of the morpheme- 
based model is that it can easily account for the fact that concatenative 
patterns are very common in the world's languages, since the model 
reduces the formal architecture to morphemes and concatenation. By 
contrast, the word-based model posits morphological rules that effect a 
wide variety of changes on the base, including concatenation, metathesis, 
vowel lengthening, etc., and treats them as equal in type. This means that 
the word-based model loses all morphology-internal explanation for the 
dominance of concatenative patterns. 

Here, however, it may be possible to find an explanation outside of the 
morphological system. Concatenative patterns are more common than 
non-concatenative patterns in part because the language structures that 
are the historical sources of concatenative patterns are more common 
than the language structures that are the historical sources of many non- 
concatenative ones. 

By far the most common way in which new morphological patterns, 
and particularly concatenative patterns, arise is by coalescence of several 
formerly free syntactic elements. When the two elements that coalesce 
are content words, the result is a compound. When one of the coalescing 
elements is a semantically abstract element that mostly serves grammatical 
functions in the sentence, the result of the coalescence is an affixed word. 
Let us consider an example from Spanish, which has a future tense that is 
formed by adding the suffix -r to the stem, followed by a series of special 
person-number suffixes: 


(3.36)        PRESENT TENSE         FUTURE TENSE
       1SG    cant-o ‘I sing’       canta-r-é ‘I will sing’
       2SG    canta-s               canta-r-ás
       3SG    canta                 canta-r-á
       1PL    canta-mos             canta-r-emos
       2PL    cantá-is              canta-r-éis
       3PL    canta-n               canta-r-án


Originally this was a syntactic pattern, involving the auxiliary verb habere 
‘have’ (Modern Spanish haber), which was combined with the infinitive 
to express obligation: habeo cantare or cantare habeo ‘I have to sing’. Then 
the meaning shifted from obligation to future, and the verb haber lost its 
freedom of position and came to occur only immediately after the main 
verb. As a result of phonological reduction, the infinitive lost its final -e 
(cantare became cantar) and the forms of the verb haber were shortened (he, 
has, ha, habemos, habéis, han). Finally, the infinitive and the forms of haber 
were fused together to form a set of single complex words: 


(3.37) cantar he         >  cantaré
       cantar has        >  cantarás
       cantar ha         >  cantará
       cantar (hab)emos  >  cantaremos
       cantar (hab)éis   >  cantaréis
       cantar han        >  cantarán
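
The fusion in (3.37) can be restated as a small procedure. The following Python sketch is purely illustrative: the function name fuse_future and the spelling adjustments it makes (dropping the silent h and writing an accent on the stressed vowel of the monosyllabic endings) are our own simplifying assumptions, and the sketch says nothing about the gradual historical stages of the change.

```python
# Shortened forms of haber that follow the infinitive; the syllable hab-
# of the 1PL and 2PL forms is dropped, as shown in (3.37).
REDUCED_HABER = {
    "1SG": "he", "2SG": "has", "3SG": "ha",
    "1PL": "emos", "2PL": "éis", "3PL": "han",
}

# Accented spellings for the stressed vowel of the monosyllabic endings.
ACCENTED = {"a": "á", "e": "é"}


def fuse_future(infinitive: str, person_number: str) -> str:
    """Fuse an infinitive and a reduced form of haber into one word-form."""
    ending = REDUCED_HABER[person_number]
    if ending.startswith("h"):
        # Assumption: the silent h is not written inside the fused word,
        # and the vowel after it carries the written accent.
        ending = ACCENTED[ending[1]] + ending[2:]
    return infinitive + ending


if __name__ == "__main__":
    for cell in REDUCED_HABER:
        print(cell, fuse_future("cantar", cell))
```

The same call works for any infinitive, which is what we expect once the former auxiliary has become a regular suffix.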


Such grammaticalization changes are extremely common in languages, 
and the vast majority of all (non-compound) concatenative patterns seem 
ultimately to go back to such syntactic phrases with auxiliary words. 

By contrast, many non-concatenative patterns historically began as 
phonological patterns. (The reader may have noticed that most of the base 
modification patterns in Section 3.1.2 were described using terms from 
phonology.) For example, the English verb house used to have a voiceless 
[s], like the noun house, but until Middle English times, all forms of the verb 
had a suffix that began with a vowel (hous[e]n, hous[e]th, hous[e]d, etc.). In 
the noun the [s] was in word-final position already in Old English. At some 
point, [s] between vowels came to be pronounced as [z] (a phonological rule 
of voicing). This, of course, affected the stem-final consonant in the verb, 
but did not affect the noun. Later, the suffixes disappeared from most of 
the verbal forms, causing the noun and the verb to be identical — except for 
the voicing of the final consonant. In this way, voicing became the (only) 
morphological marker distinguishing the verb house from the noun house 
(verb wreathe from noun wreath, thieve from thief, etc.). 

The point here is that voicing is a common phonological process, so there 
are many opportunities for voicing to develop into a morphological pattern, 
but still not nearly as many opportunities as there are for free words to 
become morphemes. There are even fewer phonological patterns that involve 
switching the order of two sounds. As a result, there are historically very few 
opportunities for patterns like morphological metathesis to develop (Janda 
1984). Thus, one possibility is to explain the frequency of concatenative 
patterns by positing a theory of mostly or solely concatenative rules, as 
the morpheme-based model does. But another possibility is to explain the 
frequency of concatenation as the indirect result of some historical changes 
being more common than others (Bybee and Newman 1995). This is the 
preferred explanation within a word-based approach. Moreover, the same 
logic can be applied to the non-restrictiveness of the word-based model. 
While the word-based model can describe many morphological patterns 
that do not exist in natural languages, including Pig Latin-type long distance 
movement of sounds, perhaps this is not a problem if those patterns would 
never arise by historical processes. Since system-external explanation is a 
major goal of linguistic research, lack of restrictiveness is not necessarily a 
disadvantage of the word-based model. 

In the end, in these two approaches to rules we find a classic conflict 
between different goals of morphological research. The morpheme-based 
model is restrictive and captures similarities between morphology and 
syntax, but at the cost of empirical adequacy. The word-based model 
is more empirically adequate, but at a cost of lost restrictiveness. It is 
therefore not the case that one approach is inherently superior to the other. 
Morphologists (and morphology students) must decide for themselves 
which goals of research are most important. But inasmuch as the problem 
of non-restrictiveness is minimized by system-external factors, whereas the 
problem of empirical inadequacy is not, the word-based approach should 
perhaps be evaluated more favourably. 


Summary of Chapter 3 


In addition to concatenative patterns (affixation and compounding), 
morphology includes a wide variety of non-concatenative patterns. 
These include conversion, reduplication, and base modification 
(palatalization, weakening, gemination, lengthening, shortening, 
tonal change, stress shift, voicing, subtraction, metathesis, etc.). 

Given these patterns, it is difficult to create a formal analysis of 
morphological structure that posits only concatenative rules similar to 
those found in syntax (the morpheme-based model). Any such theory 
would have to posit extensive zero affixes and unmotivated rules of 
allomorphy. The opposite view, that rules represent morphological 
correspondences between word-schemas (the word-based model), 
allows for a more straightforward explanation of both non- 
concatenative patterns and issues of analysis, such as back-formation 
and cross-formation. The word-based model is thus more empirically 
satisfactory. 

At the same time, one consequence is that the word-based model is 
capable of describing many kinds of morphological patterns that are 
not found in the world's languages. The morpheme-based model is 
more restrictive. Some linguists consider this a significant advantage 
of the morpheme-based model. However, a counterargument is 
that the word-based model does not need to be highly restrictive if 
unattested patterns are very unlikely to occur for reasons external 
to the morphological system (e.g. because of conditions in language 
change). The same argument helps explain why concatenative 
patterns are much more common than non-concatenative ones. 


Further reading 


An early version of the morpheme-based model was advocated by 
Bloomfield (1933). More recent morpheme-based models are Word Syntax 
(Selkirk 1982; Di Sciullo and Williams 1987; and Lieber 1992) and Distributed 
Morphology (Halle and Marantz 1993, 1994; and Harley and Noyer 1998). 

Versions of the word-based model, grouped together under the name 
of Word and Paradigm models, are advocated most strongly by Matthews 
(1972), Bybee (1985), Becker (1990), Anderson (1992), Bochner (1993), 
Aronoff (1994), and Stump (2001a), and in Dasgupta et al. (2000) and Singh 
and Starosta (2003). See also Aronoff (2007). The word-schema formalism 
used in this chapter is based especially on Becker (1990, 1993a,b) and 
Bochner (1993). 

Psycholinguists have debated whether both morpheme-based rules 
and schema-like correspondences (often called associative processes) exist 
side-by-side. See Bybee and Slobin (1982), Skousen (1989), and Alegre and 
Gordon (1999). Derwing (1990) is a summary. 

Janda (1982, 1984) and Bybee and Newman (1995) argue for a role for 
historical explanation in morphological theory. Joseph (1998) provides 
an overview of the historical sources of morphological patterns, and 
Haspelmath (1995) considers morphological reanalysis. 


Comprehension exercises 


1. Which formal operation (or combination of operations) is involved in 
the following morphological patterns? 


a. Mbay (V̀ = low tone, V́ = high tone, V̄ = mid tone)

   tétà    ‘break’   tétā    ‘break several times’
   Binda   ‘wrap’    binda   ‘wrap several times’
   riya    ‘split’   riya    ‘split several times’
   (Keegan 1997: 40)

b. Yimas

   manpa   ‘crocodile’       manpawi   ‘crocodiles’
   kika    ‘rat’             kikawi    ‘rats’
   yaka    ‘black possum’    yakawi    ‘black possums’
   (Foley 1991: 129)

c. Coptic

   kot   ‘build’   ket   ‘be built’
   hop   ‘hide’    hép   ‘be hidden’
   tom   ‘shut’    tem   ‘be shut’
   (Layton 2000: 129)

d. Hausa (V̀ = low tone, V́ = high tone)

   búgàa    ‘beat’      bubbugda    ‘beat many times’
   tdakaa   ‘step on’   tattaakda   ‘trample’
   dannée   ‘oppress’   daddannée   ‘oppress (many (times))’
   (Newman 2000: 424)

e. Tagalog

   ibigay     ‘give’          ibinigay     ‘gave’
   ipaglaba   ‘wash (for)’    ipinaglaba   ‘washed (for)’
   ipambili   ‘buy (with)’    ipinambili   ‘bought (with)’

f. German

   finden   ‘find’   gefunden   ‘found’
   singen   ‘sing’   gesungen   ‘sung’
   binden   ‘tie’    gebunden   ‘tied’


2. What would be the lexical entries of the following English morphemes 
(using the formalism of (3.22))? 


hear, -ing (as in he is playing, she is dancing, etc.), re- (as in to replay, to 
rewrite, etc.), good, -s (as in sells, knows, etc.) 


3. Formulate the morphological rule in the word-based format of (3.26) 
(i.e. as a correspondence between word-schemas) for the following 
pairs of words (each standing for a large set of such pairs): 
warm — warmer 
happy — unhappy 
play — replay 
happy — happily 


4. Formulate the morphological rule for the following Tagalog lexeme 
pairs: 

   búhay   ‘life’     buháy   ‘alive’
   gútom   ‘hunger’   gutóm   ‘hungry’
   tákot   ‘fear’     takót   ‘afraid’
   hábaʔ   ‘length’   habáʔ   ‘long’
   gálit   ‘anger’    galít   ‘angry’


5. The following pairs of English lexemes are related by cross-formation. 
Formulate the rule for them, analogous to (3.33c). 
astronomy astronomer 
philosophy philosopher 
ethnography ethnographer 


6. For French adjectives, linguists have often advocated an analysis in 
terms of subtraction: the masculine form is formed from the feminine 
form by subtracting the final consonant (Bloomfield 1933: 217): 
plat/plate    ‘flat’    [pla/plat]
laid/laide    ‘ugly’    [lɛ/lɛd]
long/longue   ‘long’    [lɔ̃/lɔ̃g]
soul/soule    ‘drunk’   [su/sul]
gris/grise    ‘grey’    [gri/griz]


Why is this an attractive analysis? 


Exploratory exercise 


One issue that was not addressed in this chapter is whether morphological 
rules ever fail to apply. We implied that morphological rules apply to all 
and only the bases that meet the conditions for a given rule, and linguists 
generally strive to formulate rules for which this is true. But is it always 
possible to write rules that are this efficient? In other words, how common are 
exceptions? The goals of this exercise are to practise writing morphological 
rules using the formalism of word-schemas and morphological correspon- 
dences, and to consider some of the issues that exceptions pose for an 
analysis of morphological structure. The methodology consists of classic 
morphological analysis. An optional last step adds a simple experimental 
component. Croatian is used for demonstration purposes, but the reader 
could choose to investigate virtually any language with significant 
inflectional morphology. 


Instructions 


Step 1: Select a language and morphological relationship to study. The best 
choices will be pairs of word-forms belonging to the same lexeme. One of 
the forms should also be more ‘basic’ than the other. For example, consider 
the following words from Croatian. 


(3.38) a. SINGULAR  PLURAL                    b. SINGULAR   PLURAL
          blog      blogovi   ‘blog(s)’          album      albumi      ‘album(s)’
          džep      džepovi   ‘pocket(s)’        biskup     biskupi     ‘bishop(s)’
          film      filmovi   ‘film(s)’          dokument   dokumenti   ‘document(s)’
          grad      gradovi   ‘city(ies)’        kamen      kameni      ‘stone(s)’
          park      parkovi   ‘park(s)’          papir      papiri      ‘paper(s)’
          vlak      vlakovi   ‘train(s)’         razgovor   razgovori   ‘conversation(s)’


Here we have chosen singular and plural nouns in the nominative form. 
(This data represents only masculine nouns with ‘hard’ stems. Croatian has 
other types of nouns as well, but they are not relevant here.) These meet 
both criteria: the word-forms are inflectionally related (e.g. blog and blogovi 
belong to the same lexeme), and the form of the singular in these examples 
is more basic than that of the plural. This indicates that it is reasonable to 
set up a morphological correspondence that derives plurals from singulars. 
It is also best to pick a morphological relationship that exhibits multiple 
morphological patterns. Here there are two patterns: the nouns in (3.38a) 
form the plural with -ovi, whereas the nouns in (3.38b) have -i in the plural. 
For reasons that will be apparent below, the more morphological patterns 
there are, the more interesting the data will be to analyze. 

Step 2: Gather examples. Build a long list of word-form pairs that express 
the chosen inflectional relationship (e.g. singular-plural). Be sure to include 
all of the relevant morphological patterns. (A good way to do this is to 
consult a dictionary that gives inflectional information, and record all of the 
relevant examples on every 10th (20th, 50th, etc.) page.) 

Step 3: Write morphological rules. Sort the data into groups according 
to morphological pattern. Write rules to describe each group. For example, 
the major generalization for Croatian is that words with only one syllable in 
the nominative singular (monosyllabic nouns) form the nominative plural 
with -ovi, whereas words with more than one syllable (polysyllabic nouns) 
form the nominative plural with -i. In (3.39), the symbol ‘σ’ is used to represent 
a syllable, so ‘Xσ’ means any string of sounds that is one syllable in length. 
‘Xσσ…’ means any string that is at least two syllables in length. 


(3.39) Croatian rule for plural formation (masculine hard-stem nouns) 

       a.  /Xσ/N             ↔   /Xovi/N
           ‘x (NOM SG)’          ‘x (NOM PL)’

       b.  /Xσσ…/N           ↔   /Xi/N
           ‘x (NOM SG)’          ‘x (NOM PL)’


Be as specific as possible about the relevant factors. The goal is for the 
schemas to subsume as many word-forms as possible that follow the rule, 
while excluding as many as possible that do not. For example, the rule in 
(3.39a) excludes all polysyllabic nouns, because only monosyllabic nouns 
match the left schema (linguists say that monosyllabic nouns ‘meet the 
conditions’ for the rule of plural formation). 
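
As a rough illustration, the two rules in (3.39) can be written as a single procedure that checks which left-hand schema a singular form matches. This is a sketch under stated assumptions, not a full analysis: the helper names are hypothetical, and syllables are counted by counting vowel letters, which ignores syllabic r and any other complication of Croatian phonology.

```python
VOWELS = set("aeiou")


def count_syllables(word: str) -> int:
    """Count syllables, assuming every vowel letter is one syllable nucleus."""
    return sum(1 for letter in word.lower() if letter in VOWELS)


def predicted_plural(singular: str) -> str:
    """Return the nominative plural predicted by the rules in (3.39)."""
    if count_syllables(singular) == 1:
        # The form matches the left-hand schema of (3.39a): one syllable.
        return singular + "ovi"
    # Otherwise the form is treated as matching (3.39b): two or more syllables.
    return singular + "i"


if __name__ == "__main__":
    for noun in ["grad", "park", "papir", "dokument"]:
        print(noun, predicted_plural(noun))
```

Applied to nonce forms such as brag or adret, the sketch returns bragovi and adreti, which are the predictions that the rules in (3.39) make for new words.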

Formulating rules may not be easy. For example, a large data set of 
Croatian would also include the nouns in (3.40). 


(3.40) a. SINGULAR  PLURAL                                b. SINGULAR  PLURAL
          golub     golubovi     ‘pigeon(s)’                 cent      centi   ‘cent(s)’
          jastreb   jastrebovi   ‘hawk(s)’                   dan       dani    ‘day(s)’
          pramen    pramenovi    ‘tuft(s), hair lock(s)’     gost      gosti   ‘guest(s)’


The problem is apparent: the singular word-forms in (3.40a) meet the 
conditions for the rule in (3.39b), but seem to undergo rule (3.39a). The 
examples in (3.40b) have the same problem with regard to the rule in (3.39a). 
Does this mean that the rules in (3.39) are incorrect? 

This is where type frequency can help. More than 93 per cent of 
monosyllabic nouns followed the pattern in (3.38a); fewer than 7 per cent 
followed the pattern in (3.40b). Polysyllabic nouns are similarly likely 
to follow the pattern in (3.38b), rather than (3.40a). This suggests that the 
generalization formalized by the rules in (3.39) is fundamentally correct. 
The examples in (3.40) are exceptions. Look for similar issues in your data. 
You may need to rewrite some rules, or write new rules, to cover all of 
the examples in your data set. There is no rigid formula for morphological 
analysis — linguists develop a feel for good analysis through practice. 
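
The comparison between rule and data can be made mechanical. The sketch below tallies, for the eighteen nouns cited in (3.38) and (3.40), how many attested plurals differ from the form predicted by (3.39). The function names and the vowel-counting shortcut are assumptions made for the illustration, and because the sample deliberately includes the exceptions from (3.40), the proportions it yields are not the type frequencies reported above.

```python
VOWELS = set("aeiou")

# Attested singular-plural pairs from (3.38) and (3.40).
ATTESTED = {
    "blog": "blogovi", "džep": "džepovi", "film": "filmovi",
    "grad": "gradovi", "park": "parkovi", "vlak": "vlakovi",
    "album": "albumi", "biskup": "biskupi", "dokument": "dokumenti",
    "kamen": "kameni", "papir": "papiri", "razgovor": "razgovori",
    "golub": "golubovi", "jastreb": "jastrebovi", "pramen": "pramenovi",
    "cent": "centi", "dan": "dani", "gost": "gosti",
}


def is_monosyllabic(word: str) -> bool:
    """Assumption: one vowel letter means one syllable."""
    return sum(1 for letter in word if letter in VOWELS) == 1


def predicted_plural(singular: str) -> str:
    """Plural predicted by (3.39): -ovi for monosyllables, -i otherwise."""
    return singular + ("ovi" if is_monosyllabic(singular) else "i")


def exception_report(pairs: dict) -> dict:
    """Return {noun class: (number of exceptions, number of nouns)}."""
    counts = {"monosyllabic": [0, 0], "polysyllabic": [0, 0]}
    for singular, plural in pairs.items():
        noun_class = "monosyllabic" if is_monosyllabic(singular) else "polysyllabic"
        counts[noun_class][1] += 1
        if predicted_plural(singular) != plural:
            counts[noun_class][0] += 1
    return {name: tuple(pair) for name, pair in counts.items()}


if __name__ == "__main__":
    print(exception_report(ATTESTED))
```

On a sample gathered as described in Step 2, the same tally gives a first estimate of how exceptional each pattern is.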

Step 4: Ponder the implications of any exceptions. Consider the following 
questions in the context of your analysis. 

1) How many exceptions can accrue before they no longer seem 
‘exceptional’, and instead seem to be a rule-governed pattern? There is 
no absolute right or wrong answer here. Consider your data, the Croatian 
data, or other examples. Identify factors that would lead you to conclude 
that a given example is rule-governed, and factors that would lead you to 
conclude that it is truly idiosyncratic, and needs to be learned on an item- 
by-item basis. Explain your reasoning. 

2) Think about whether additional data might help decide whether a 
particular group of exceptions is, or is not, represented by a morphological 
rule. For instance, if exceptions do not follow any rules (i.e. if the relevant 
word-forms are truly anomalous), how would you expect new words to 
behave? If a series of new monosyllabic words were to enter Croatian, 
would you expect 93 per cent of these new words to form plurals with 
-ovi, and 7 per cent with -i? This would be a result that matches the type 
frequency of similar existing nouns. Would you expect all of the new words 
to have plurals formed with -ovi? Or would you expect something else? 
Would your predictions change if the exceptions were described by a rule, 
albeit one that does not apply very widely? Do new words in a language 
(e.g. ones borrowed from another language) necessarily follow the same 
morphological patterns as existing native words? Apply the same questions 
to your data. Try to explain why you expect one or the other result. 

Step 5 (optional): Test your predictions about new forms with native 
speakers. Make up a variety of hypothetical words of the language (non- 
words). Non-words follow the phonological rules of a language, and sound 
like they could be words, but are not. Your non-words should match the 
schema for the ‘basic’ form. Present native speakers with the non-words 
and ask them to produce the relevant derived form. For example, we might 
ask Croatian speakers to decide what the plural form would be, given the 
hypothetical singular nouns brag, glik, adret, bakral, mokilar, etc. Are your 
predictions correct? And in general, do speakers behave in the ways that 
your rules would suggest? 


Lexicon 


In this chapter we look more closely at morphemes, focusing on the 
following fundamental issue: Do speakers memorize entire complex 
word-forms (readable, reads, washable), their component morphemes (read, 
wash, -able, -s), or both? Another way to ask the same question is: What 
is the content of the lexicon? Remember that the lexicon is the linguist's 
term for the language user's mental dictionary. When a linguist says that 
something is listed in the lexicon, this means that it must be stored in 
speakers' memories (but linguists generally prefer the more abstract, less 
psychological-sounding terminology).¹ 

The content of the lexicon is an important issue for any theory of 
morphology because lexical items are the fundamental building blocks of 
morphological structure. They are the bases to which morphological rules 
apply. As such, our view of the lexicon affects our analysis of morphological 
structure in broad ways. If evidence points to the lexicon consisting 
primarily of morphemes, the rules that we write will operate on morpheme- 
based structures. And correspondingly, if evidence suggests that the lexicon 
consists primarily of words, the rules that we posit will be fundamentally 
word-based. The material in this chapter is thus complementary to the 
discussion in Chapter 3. 

All linguists agree that the lexicon must contain at least all the information 
that is not predictable from general rules. For instance, an English speaker's 
lexicon must contain the monomorphemic English verbs arrive, refuse, deny, 
and words showing extreme semantic peculiarities (e.g. awful, which is not 
at all the same as ‘full of awe’). But does the lexicon additionally contain 
predictable information? For example, does an English speaker's lexicon 
contain the complex word-form helpful, despite the fact that this word is 
easily segmented into the morphemes help and -ful, and fully predictable 
from the meaning of these parts? Here, there is disagreement. 

¹ A distinction is sometimes made between a lexicon and a mental lexicon, where the lexicon 
is a purely abstract tool of linguists to describe roots and affixes that does not necessarily 
correspond in any way to speakers' mental knowledge. The term mental lexicon is then 
used for the more psychological concept of a speaker's mental dictionary. However, we 
follow the view that linguists should strive to analyze language in ways that are plausible 
representations of speakers' knowledge, so we will continue to talk about the lexicon in 
terms of a (hypothetical) speaker, and not distinguish between these terms. 

We can identify three major positions. One possibility is that no regular 
complex words (like helpful) are stored in the lexicon. On this view, the 
lexicon contains, to the extent possible, just simple, monomorphemic 
elements, i.e. roots and affixes. Idiosyncratic complex words are also lexical 
entries, but virtually all complex words are created by rule, rather than 
being listed. This is a morpheme lexicon. It corresponds to the morpheme- 
based model. Another position takes exactly the opposite view: not just 
some, but all complex word-forms are included in the lexicon, whether 
they are predictable or idiosyncratic. This is a strict word-form lexicon. 
The third position is intermediary — it posits that word-forms, morphemes 
and derived stems are all potentially listed in the lexicon. Whether any 
particular word-form is listed depends on a variety of factors. Since word- 
forms still play the primary role in this approach, we call it a moderate 
word-form lexicon. Both the strict word-form lexicon and the moderate 
word-form lexicon are consistent with the word-based model from Chapter 
3. In the following sections we evaluate these hypotheses. 


4.1 A morpheme lexicon? 


Based on the apparent parallelism between sentences, morphemes and 
phonemes shown in Figure 4.1, we might assume that the basic units of 
the lexicon are morphemes. Just as language users do not memorize each 
sentence that they use, we can also hypothesize that language users do not 
generally memorize complex words. 


             Camilla met an unfriendly chameleon

syntax:      sentences consist of words
             Camilla | met | an | unfriendly | chameleon

morphology:  words consist of morphemes
             un- | friend | -ly

phonology:   morphemes consist of phonemes
             f-r-e-n-d


Figure 4.1: A simple picture 


This is an appealing hypothesis not only because of the parallels to syntax 
and phonology, but also because linguists seek to provide an elegant 
description of language structure. 

In the context of the lexicon, elegance is often measured in terms of 
economy, and a morpheme lexicon is maximally economical. To see how 
this works, consider again the word-forms belonging to the Modern Greek 
noun lexeme FILOS ‘friend’ (repeated from (2.2)): 


(4.1) The paradigm of FILOS 
                   SINGULAR   PLURAL
      NOMINATIVE   filos      fili
      ACCUSATIVE   filo       filus
      GENITIVE     filu       filon


Combined, these six word-forms contain seven unique morphemes (fil-, -os, 
-o, -u, -i, -us, and -on). It might therefore seem more efficient to store the 
individual complex word-forms: six versus seven lexical entries. However, 
there are of course other lexemes whose word-forms follow the same 
morphological patterns as FILOS, including KOSMOS ‘world’ (kósm-os, kósm-o, 
kósm-u, kósm-i, kósm-us, kósm-on), FOVOS ‘fear’ (fóv-os, ...), GAMOS ‘marriage’ 
(gám-os, ...), and SKILOS ‘dog’ (skíl-os, ...), just to list a few. If all word-forms 
are directly listed in the lexicon, there are thirty lexical entries corresponding 
to just these five lexemes. However, if only the morphemes are listed, there 
are eleven corresponding lexical entries, because the suffixes do not need to 
be repeated: fil-, kósm-, fóv-, gám-, skíl-, -os, -o, -u, -i, -us and -on. Multiplied 
by thousands of Greek noun lexemes, the morpheme approach becomes 
much more economical than listing each individual word-form. 
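
The arithmetic behind this comparison is easy to check. The following sketch counts lexical entries under the two hypotheses for the five lexemes just mentioned. Stress marks are omitted for simplicity, and the function names are hypothetical.

```python
# Stems and suffixes of the five Greek nouns discussed above (stress omitted).
STEMS = ["fil", "kosm", "fov", "gam", "skil"]
SUFFIXES = ["os", "o", "u", "i", "us", "on"]


def word_form_entries(stems, suffixes):
    """Entries needed if every inflected word-form is listed separately."""
    return {stem + suffix for stem in stems for suffix in suffixes}


def morpheme_entries(stems, suffixes):
    """Entries needed if only stems and suffixes are listed."""
    return {stem + "-" for stem in stems} | {"-" + suffix for suffix in suffixes}


if __name__ == "__main__":
    print(len(word_form_entries(STEMS, SUFFIXES)))  # 30 word-form entries
    print(len(morpheme_entries(STEMS, SUFFIXES)))   # 11 morpheme entries
```

For n lexemes of this inflection class, a word-form lexicon needs 6n entries, whereas a morpheme lexicon needs n + 6, so the gap widens as lexemes are added.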

Note that economical representation is not restricted to inflectionally 
related forms. The English word-forms read, reader and readable belong to 
the same word family (i.e. they are derivationally related), and the word- 
forms write, writer and writeable, and many other sets of words, are related 
in a parallel fashion. In principle, it is thus possible to posit a lexicon that is 
quite economical — one lexical entry per morpheme. 

Unfortunately, a morpheme lexicon is not usually as elegant in practice as 
it is in principle. There are several kinds of complications that force even a 
theory seeking a minimal lexicon to posit some word-forms as lexical entries. 
In fact, we have already anticipated two potential problems in preceding 
chapters — unpredictability of meaning and lack of morpheme segmentability. 

First, if the lexicon consists primarily of separate morphemes that are 
combined together to form words, the meaning of a complex word should 
be equal to the sum of the meanings of its component morphemes. Stated 
differently, the word should exhibit compositional meaning. But as 
we have already seen, this kind of direct relationship between form and 
meaning does not always occur, and derivational morphology presents a 
particular problem in this regard. A reader is not just any person who reads, 
but also a kind of textbook and the title of an academic job (in the British 
system). These last two meanings are not predictable from the meanings of 
read and -er individually; the meaning is non-compositional. This indicates 
that reader (textbook) and Reader (British academic title) are probably 
represented in the lexicon as complex words, rather than according to the 
component morphemes. The hypothesis that the lexicon consists (almost) 
exclusively of morphemes thus faces the same practical problem that has 
led dictionary-makers to give one entry to each lexeme - the meaning 
of a derived lexeme is often more than the sum of the meanings of the 
component parts. Since many languages have a large number of derived 
lexemes with unpredictable meaning, there is correspondingly a significant 
problem for the hypothesis of a morpheme lexicon. 

Another potential problem for a morpheme lexicon is lack of morpheme 
segmentability. Here we briefly describe four types: base modification, 
cumulative expression, zero expression and empty morphs. 

We have already encountered examples of base modification in Chapter 
3. A familiar example, plurals in German, is repeated as (4.2). Some German 
plurals are formed by replacing a back vowel of the singular form by a front 
vowel ([ʊ], [a:], [ɔ] are replaced by [ʏ], [e:], [œ], spelled ü, ä, ö). Although this 
is clearly a morphological pattern, because an aspect of form corresponds 
to an aspect of meaning, it is not possible to segment a proper morpheme 
meaning ‘plural’. 


(4.2) SINGULAR   PLURAL
      Mutter     Mütter    ‘mother(s)’
      Vater      Väter     ‘father(s)’
      Tochter    Töchter   ‘daughter(s)’
      Nagel      Nägel     ‘nail(s)’


Second, when an affix expresses two different morphological meanings 
simultaneously, we have cumulative expression (also called fusion). For 
example, the Serbian noun ovca ‘sheep’ has the number and case forms 
shown in (4.3). 


(4.3)                SINGULAR   PLURAL
      NOMINATIVE     ovc-a      ovc-e
      ACCUSATIVE     ovc-u      ovc-e
      GENITIVE       ovc-e      ovac-a
      DATIVE         ovc-i      ovc-ama
      INSTRUMENTAL   ovc-om     ovc-ama
      VOCATIVE       ovc-o      ovc-e


Clearly, it is not possible to isolate separate singular or plural or nominative 
or accusative (etc.) morphemes. The suffixes that follow the stem 
ov(a)c- express number and case simultaneously, or, in the technical term of 
morphology, cumulatively. Cumulative or fused expression is most often 
illustrated with different inflectional meanings, but it is also possible for 
an inflectional meaning and a derivational meaning to be expressed 
cumulatively. In Krongo, a language of Sudan, the derivational meaning ‘agent’ 
and the inflectional meanings ‘singular’ and ‘plural’ are expressed in a single 
affix: cà-/cò- denotes ‘agent/singular’, and ka-/ko- denotes ‘agent/plural’. 


(4.4) malin   ‘theft’   ca-malin   ‘thief’    ka-malin   ‘thieves’
      moto    ‘work’    có-mótó    ‘worker’   kò-mòtò    ‘workers’
      (Reh 1985: 157) 
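
Returning to the Serbian paradigm in (4.3), the following sketch shows one way of making the notion of cumulative expression concrete. It lists the cells that each suffix realizes and asks whether any segment recurs in every singular suffix or in every plural suffix, as we would expect if number had an exponent of its own. The feature labels and function names are our assumptions, and the segment test is deliberately crude.

```python
from collections import defaultdict

# Suffixes of ovca 'sheep' from (4.3), keyed by (case, number).
SUFFIX = {
    ("NOM", "SG"): "a", ("NOM", "PL"): "e",
    ("ACC", "SG"): "u", ("ACC", "PL"): "e",
    ("GEN", "SG"): "e", ("GEN", "PL"): "a",
    ("DAT", "SG"): "i", ("DAT", "PL"): "ama",
    ("INS", "SG"): "om", ("INS", "PL"): "ama",
    ("VOC", "SG"): "o", ("VOC", "PL"): "e",
}


def cells_by_suffix(paradigm):
    """Group the paradigm cells according to the suffix that realizes them."""
    cells = defaultdict(list)
    for cell, suffix in paradigm.items():
        cells[suffix].append(cell)
    return dict(cells)


def recurring_segments(paradigm, number):
    """Return the letters that occur in every suffix of the given number."""
    suffixes = [sfx for (case, num), sfx in paradigm.items() if num == number]
    return set.intersection(*(set(sfx) for sfx in suffixes))


if __name__ == "__main__":
    print(cells_by_suffix(SUFFIX)["e"])
    print(recurring_segments(SUFFIX, "SG"), recurring_segments(SUFFIX, "PL"))
```

The suffix -e turns up in four unrelated cells, and neither the singular nor the plural suffixes share a segment, so there is nothing that could be segmented off as a number morpheme.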


A suppletive stem may also simultaneously express the base meaning 
and the grammatical meaning. Thus, English worse expresses the lexeme 
meaning ‘bad’ and the inflectional meaning ‘comparative’ in a cumulative 
way. Affixes and stems that cumulatively express two meanings that would 
be expected to be expressed separately are also called portmanteau morphs. 

A particularly important phenomenon that causes problems for 
segmentation is the existence of words in which a morphological meaning 
corresponds to no overt form, i.e. a zero affix (or zero expression). Two 
examples are given in (4.5) and (4.6). 


(4.5) Coptic 

      čō-i   ‘my head’
      čō-k   ‘your (M) head’
      čō     ‘your (F) head’
      čō-f   ‘his head’
      čō-s   ‘her head’
      (Layton 2000: 69, 103) 


(4.6) Finnish 

      oli-n     ‘I was’
      oli-t     ‘you were’
      oli       ‘he/she was’
      oli-mme   ‘we were’
      oli-tte   ‘you(PL) were’
      oli-vat   ‘they were’


Some morphologists have worked with the requirement that the segmenta- 
tion of words into morphemes must be exhaustive and all meanings must be 
assigned to a morpheme. If we adopt this requirement, then we are forced to 
posit zero morphemes here that have a meaning, but no form (so Finnish oli 
would really have the structure oli-Ø, where the morpheme Ø stands for the 
third person singular, and Coptic čō would formally have the structure čō-Ø). 
But the requirement is not necessary, and alternatively one could say, for 
instance, that Finnish has no marker for the third person singular in verbs. To 
be sure, the practical difference between the affixation of an unpronounced 
element and no affixation at all is not great, but at a conceptual level the two 
approaches are substantially different. And it does seem to be the case that 
the latter is less far-fetched and cognitively more plausible. 

Finally, the opposite of zero affixes can also be found: apparent cases of 
morphemes that have form but no meaning (called empty morphs). For 
example, in Lezgian all nominal case-forms except for the absolutive case (i.e. 
the most basic case) arguably contain a suffix that follows the noun stem and 
precedes the case suffix. In (4.7), four of Lezgian's sixteen cases are shown. 


(4.7) ABSOLUTIVE   sew        fil          Rahim
      GENITIVE     sew-re-n   fil-di-n     Rahim-a-n
      DATIVE       sew-re-z   fil-di-z     Rahim-a-z
      SUBESSIVE    sew-re-k   fil-di-k     Rahim-a-k
                   ‘bear’     ‘elephant’   (male name)
      (Haspelmath 1993: 74–5) 


Under this analysis, the suffixes -re, -di and -a have no meaning, but they 
must be posited if we want to have a maximally general description. With 
the notion of an empty morph we can say that different nouns select different 
suppletive stem suffixes, but that the actual case suffixes that are affixed to 
the stem are uniform for all nouns. The alternative would be to say that the 
genitive suffix has several different suppletive allomorphs (-ren, -din, -an), 
the dative case has several different allomorphs (-rez, -diz, -az), and so on. But 
such a description would be inelegant, missing the obvious and exceptionless 
generalization that the non-absolutive case suffixes share an element. 
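
The gain in generality can be illustrated with a short sketch. The first function builds the case-forms in (4.7) from one stem suffix per noun and one case suffix per case; the second lists the fused allomorphs that the alternative description would need. The dictionary and function names are hypothetical, and only the three nouns and four cases shown in (4.7) are covered.

```python
# Empty-morph analysis: each noun selects a meaningless stem suffix ...
STEM_SUFFIX = {"sew": "re", "fil": "di", "Rahim": "a"}
# ... and each non-absolutive case has a single, uniform suffix.
CASE_SUFFIX = {"GENITIVE": "n", "DATIVE": "z", "SUBESSIVE": "k"}


def case_form(noun: str, case: str) -> str:
    """Build a segmented case-form as in (4.7)."""
    if case == "ABSOLUTIVE":
        return noun  # the absolutive is the bare stem
    return f"{noun}-{STEM_SUFFIX[noun]}-{CASE_SUFFIX[case]}"


def fused_allomorphs() -> dict:
    """List the case allomorphs needed if stem suffixes are not recognized."""
    return {
        case: ["-" + stem_suffix + case_suffix
               for stem_suffix in STEM_SUFFIX.values()]
        for case, case_suffix in CASE_SUFFIX.items()
    }


if __name__ == "__main__":
    print(case_form("sew", "GENITIVE"))
    print(fused_allomorphs()["GENITIVE"])
```

With three nouns and three non-absolutive cases, the empty-morph analysis lists six suffixes, whereas the alternative lists nine allomorphs; the difference grows with every additional stem suffix and case.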

In all four of these examples we find a similar problem for morphological 
segmentation. We can identify a morphological pattern which applies to the 
word, but it is difficult or impossible to segment a proper morpheme. This 
has undesirable consequences for the hypothesis of a morpheme lexicon in at 
least two ways. First, in Section 1.1 we defined a morpheme as the smallest 
meaningful part of a linguistic expression that can be identified by segmentation. 
However, empty morphs have no meaning, and zero affixes have no form and 
therefore cannot be segmented. So positing zero affixes and empty morphs 
allows for a maximally economical lexicon, but if we incorporate these devices, 
we must expand the idea of what counts as a ‘morpheme’ lexical entry. And 
allowing lexical entries that can have form but no meaning, or the reverse, 
greatly reduces the restrictiveness of the morpheme-based model. 

Second, non-segmentable morphological patterns may force the 
morphological system to become more complicated in other ways. For 
instance, in Tiv, some classes of words express the imperative with a high 
tone on the final syllable. This is thus a kind of base modification. (In this 
analysis, the low tones on non-final syllables are filled in by default.) 


(4.8)  ROOT      IMPERATIVE   GLOSS 
       kimbi     kimbí        ‘pay’ 
       kangasa   kangasá      ‘chew cud’ 
       de        dé           ‘leave’ 
       gba       gbá          ‘fall’ 
       va        vá           ‘come’ 


(based on Abraham 1940: 29) 


Since this pattern is completely regular, it is possible here to argue that 
the lexicon contains only the root, and that the imperative is derived by 
a rule of tone assignment. However, note that under such an analysis, the 
imperative meaning is carried by the rule, not by an affixal lexical entry. 
So some inflectional meanings are associated with lexical entries, and 
some with rules. It seems undesirable to complicate our description of 
the morphological system in this way, but this problem cannot be easily 
resolved if the primary goal is to maintain a maximally economical lexicon. 
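
A rule of this kind might be sketched as follows. The sketch assumes that roots are written without tone marks, that high tone is written with an acute accent, and that the last vowel letter belongs to the final syllable (true of the vowel-final roots in (4.8)); the function name is hypothetical.

```python
import unicodedata

VOWELS = set("aeiou")
ACUTE = "\u0301"  # combining acute accent, used here to mark high tone


def imperative(root):
    """Place a high tone on the last vowel of a toneless root."""
    for i in range(len(root) - 1, -1, -1):
        if root[i] in VOWELS:
            marked = root[:i + 1] + ACUTE + root[i + 1:]
            # Compose base letter + accent into a single character
            return unicodedata.normalize("NFC", marked)
    raise ValueError("root contains no vowel")


for root in ["de", "va", "kangasa"]:
    print(root, imperative(root))
```

Note that the imperative meaning lives in the function itself: nothing in the list of roots corresponds to it, which is exactly the point made above about meaning being carried by a rule rather than by a lexical entry.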

In short, a morpheme lexicon seeks to minimize the information in the 
lexicon by subsuming as much information as possible under general 
principles of grammar, and including in the lexicon only that information 
that is unpredictable. This approach has the advantage of being highly 
elegant if (and only if) it is empirically adequate, and if a minimal lexicon 
does not lead to complications elsewhere in the morphological system. 
However, a morpheme lexicon often runs into one or both kinds of 
problems, depending upon the morphological patterns of a given language. 
There are thus quite a few problems that are faced by any attempt to make 
morphemes (in the sense of minimal morphological constituents) the 
cornerstone of morphological analysis and the basic unit of the lexicon. The 
major issues are summarized in Table 4.1. 


Problems for a morpheme lexicon       Example 

Unpredictable meaning of derived      Reader (British academic) does not 
lexemes                               mean read + -er 

Lack of morpheme segmentability 
   Base modification                  German plurals, e.g. Mütter 
   Cumulative expression              Serbian noun paradigm, e.g. ovc-a 
   Zero expression                    Finnish third person singular, e.g. oli 
   Empty morphs                       Lezgian non-absolutive noun 
                                      paradigm, e.g. sew-re-n 


Table 4.1 Problems for a morpheme lexicon: summary 


4.2 A strict word-form lexicon? 


An alternative hypothesis is that the lexicon consists entirely of word- 
forms, both simple and complex. This approach is free of the problems with 
a morpheme lexicon that we have identified. For example, since meaning 
is associated in the lexicon with word-forms, not with morphemes, the 
meaning of a word-form need not be equal to the combined meanings of its 
morphemes. And morpheme segmentability becomes a significantly lesser 
problem if morphemes are not the basic units of the lexicon. 

A word-form lexicon also has a number of other advantages. For 
example, it helps to explain traits unique to morphology, such as lack of 
productivity. Morphological patterns that can be used to create new words 
are called productive. Both derivational and inflectional patterns are often 
productive. Thus, the German plural suffix -en (e.g. Fahrt 'trip', plural 
Fahrt-en ‘trips’) can create new words when it is applied to new bases such 
as loanwords (e.g. Box 'loudspeaker unit', borrowed from English box, in 
German receives the plural Box-en). Truly novel words are far less common 
than novel sentences, and most of the time we use words that we have used 
many times before. But, in principle, morphology is like syntax in that it 
may be productive. 

From this perspective, what is really remarkable about morphology is 
that morphological patterns may also be unproductive. For example, there 
are a number of English action nouns containing -al (some of which are 
listed in (4.9a)). As the hypothetical but unacceptable forms in (4.9b) show, 
there are many verbs to which this suffix cannot be applied. 


(4.9) a. refusal, revival, dismissal, upheaval, arrival, bestowal, denial, betrayal 
b. *repairal, *ignoral, *amusal, *belial, *debuggal 


But the crucial point is one that cannot be made by giving examples: the 
suffix -al cannot be used at all to form novel lexemes in English. The list of 
nouns formed with -al is fixed (it contains 35 nouns according to the OED), 
and no new nouns can be added to this list. 

The reason why languages may have unproductive morphological 
patterns is that complex words, like simple words, are stored in the lexicon. 
Since it cannot be predicted that these verbs have an action noun in -al, the 
lexicon contains the nouns arrival, refusal, denial, in addition to arrive, refuse, 
and deny. When English speakers use a noun like arrival, in all likelihood 
they simply retrieve it from their lexicon rather than constructing it on the 
fly. 

A variety of complex words must therefore be listed in the lexicon. 
At the very least, the list includes complex words for which a suffix is 
unproductive and thus unpredictable (like arrival), and those for which the 
meaning is unpredictable (like Reader in the British academic sense). Faced 
with these ‘exceptions’, we could conclude that there is overwhelming 
evidence in favour of word-based structure, that the word-based model is 
therefore superior to the morpheme-based model and that we do not need 
morphemes at all in morphology. This is the essence of a strict word-form 
lexicon, and a number of morphologists have drawn this conclusion. But at 
this stage a few words of caution seem in order. There are some apparent, 
and some real, problems with the hypothesis of a strict word-form lexicon. 

First, should we worry about the inherent lack of elegance in a theory that 
lists all words? The answer depends upon which goal(s) of morphological 
research we consider most important. In addition to elegant description, 
in our morphological description we aim for cognitive realism, system- 
external explanation and restrictive architecture. There are indications 
that a word-form lexicon is more cognitively realistic than a morpheme 
lexicon. Speakers remember a word not only if it is unpredictable, but also 
if it is very frequent. This is a general feature of animal (including human) 
cognition: the more often something is encountered, the more easily it is 
remembered (for instance, the more often a pianist plays a piece, the sooner 
she or he will be able to play it by heart). This applies to words, whether 
predictable or unpredictable, as to anything else. This is thus a classic case 
in which different goals of morphological research lead to a conflict. And 
to the degree that cognitive realism is given greater priority than elegance, 
we need not be very concerned about the lack of elegance of a word- 
form lexicon. Of course, if we prioritize elegance, we would not find a 
word-form lexicon to be very satisfactory. 

Second, a common argument against a strict word-form lexicon relates 
to agglutinative languages such as Turkish. In Turkish, words can be quite 
long; see (4.10). According to one count, 20% of Turkish words have at least 
five morphemes (Hankamer 1989), and it is possible (though certainly not 
common) for Turkish verbs to contain ten or more inflectional morphemes. 


(4.10) a. oku-r-sa-m 
          read-AOR-COND-1SG 
          ‘If I read ...’ 

       b. oku-malı-y-mış-ız 
          read-NEC-be-REP.PST-1PL 
          ‘They say that we have to read.’ 

       c. oku-ya-ma-yabil-ir-im 
          read-POT-NEG-POT-AOR-1SG 
          ‘I might not be able to read.’ 

(Kornfilt 1997: 367-75) 


Moreover, the inflectional system contains dozens of verbal affixes. Not all 
can co-occur (e.g. a word can have only one subject agreement marker). 
Still, the combinatory possibilities entail that every verb root can appear in 
a very large number of word-forms - at least 2,000. Multiplied by thousands 
of verbs, it seems completely impossible to memorize all forms of all verbs 
that a speaker might want to use (Hankamer 1989). 

Certainly agglutinative languages present a challenge to claims of a strict 
word-form lexicon, but it is not clear whether languages like Turkish are 
really problematic. It may be possible to assume a weaker version of the 
word-form lexicon, according to which a speaker memorizes all word- 
forms that they have heard, or that they have heard a certain number of 
times. This issue remains to be investigated. 

A more significant problem is that there is some evidence that speakers 
themselves see words as consisting of morphemes. Some linguists have 
claimed that morphological rules never make reference to word-internal 
structure, so there is no need to assume that words 'have structure' once 
they have been formed according to the rules. But this does not seem quite 
right. For one thing, allomorphy is often conditioned by the morphological 
structure of the base. For example, Dutch past participles are marked by 
the prefix ge- (e.g. spreken ‘speak’, ge-sproken 'spoken') unless the verb 
bears a derivational prefix such as be- (e.g. be-spreken 'discuss', be-sproken 
‘discussed’, not *ge-be-sproken). The Sanskrit converb is formed by the suffix 
-tvā if the verb has no prefix (e.g. ga-tvā ‘having gone’, nī-tvā ‘having led’), 
but by the suffix -ya if the verb has a prefix (e.g. ā-gam-ya ‘having come’, 
not *ā-ga-tvā, *pari-ņī-tvā) (Carstairs-McCarthy 1993). This pattern would be 
very difficult to describe if we think that speakers have no knowledge of 
which stems have prefixes. 
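
The Dutch case can be made concrete with a small sketch. It covers only the situation described above (a bare stem versus a stem bearing a derivational prefix such as be-); the function and parameter names are hypothetical.

```python
def past_participle(participial_stem, derivational_prefix=None):
    """Choose the participle shape from the morphological structure of the verb."""
    if derivational_prefix is None:
        return "ge-" + participial_stem  # no prefix on the verb: ge- is added
    # The verb already bears a derivational prefix: ge- is not added
    return derivational_prefix + "-" + participial_stem


print(past_participle("sproken"))        # spreken
print(past_participle("sproken", "be"))  # be-spreken
```

The rule can only be stated because the prefix is passed in as a separate piece of information. A lexicon that stored besproken as an unanalysed string would give the rule nothing to refer to.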

Finally, morphemes also seem to have relevance for phonology. For 
example, many languages have phonological morpheme structure 
conditions, i.e. restrictions on the co-occurrence of sounds within a 
morpheme. For example, English allows combinations such as [tθ] and [dθ] 
in complex words like eighth and width, but not within a single morpheme. 
German allows syllable-final consonant clusters such as [rpsts] as in Herbst-s 
(genitive of Herbst ‘autumn’), but within a single morpheme four consonants 
(e.g. [rpst]) are the maximum. In addition, alternations may be sensitive to 
morpheme boundaries. Standard Northern Italian has an alternation in the 
pronunciation of s between [s] and [z], whereby [z] is chosen if the s occurs 
between vowels (e.g. casa [-z-] ‘house’) and [s] is chosen elsewhere (e.g. santo 
[s-] ‘saint’). However, if the s is morpheme-initial, it is pronounced [s] even 
if it occurs between vowels (e.g. asimmetrico [-s-] ‘asymmetric’, risocializzare 
‘resocialize’) (Baroni 2001). These phenomena, too, seem to require that we 
recognize morphemes as real entities. 
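
The Italian alternation can be sketched in the same spirit. The sketch works on spelling rather than on a phonemic transcription, writes z as a stand-in for [z], and takes the word as a list of morphemes; these are simplifying assumptions, and the function name is hypothetical.

```python
VOWELS = set("aeiou")


def pronounce_s(morphemes):
    """Write intervocalic s as z unless the s is morpheme-initial."""
    word = "".join(morphemes)
    # Record the position at which each morpheme starts
    starts = set()
    position = 0
    for morpheme in morphemes:
        starts.add(position)
        position += len(morpheme)
    result = []
    for i, letter in enumerate(word):
        between_vowels = (
            0 < i < len(word) - 1
            and word[i - 1] in VOWELS
            and word[i + 1] in VOWELS
        )
        if letter == "s" and between_vowels and i not in starts:
            result.append("z")
        else:
            result.append(letter)
    return "".join(result)


print(pronounce_s(["casa"]))
print(pronounce_s(["santo"]))
print(pronounce_s(["a", "simmetrico"]))
```

If asimmetrico is passed as a single unanalysed morpheme, the function wrongly voices the s, which illustrates why the alternation needs access to the morpheme boundary.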

A strict word-form lexicon is thus faced with a number of problems, 
which are summarized in Table 4.2. These facts make the strict word-form 
lexicon hypothesis less than fully satisfactory. 


Problems for a strict word-form lexicon     Example 

Some morphological patterns necessarily     Conditions on Dutch past 
refer to morpheme-based structures          participle prefix ge- 

Some phonological patterns necessarily      Italian asimmetrico pronounced 
refer to morpheme-based structures          with [s], not [z], because of 
                                            morpheme boundary 

Speakers are unlikely to memorize all       Turkish 
word-forms in rich inflectional languages 


Table 4.2 Problems for a strict word-form lexicon: summary 


4.3 Reconciling word-forms and morphemes 


Thankfully, the evidence for morphemes as real entities is compatible with 
a lexicon in which most lexical entries consist of (complex) words. This 
is because it is possible to treat a morpheme as a generalization based on 
word-forms in the lexicon. 

For instance, according to the hypothesis of a moderate word-form 
lexicon, both the complex lexemes in (4.11) and the morphemes in (4.12) 
can be lexical entries. The latter are descriptions of patterns found among 
words in the lexicon. 


(4.11) word lexical entries (Russian) 


a. /ruka/N                b. /ruku/N 
   ‘hand.NOM.SG’             ‘hand.ACC.SG’ 
c. /riba/N                d. /ribu/N 
   ‘fish.NOM.SG’             ‘fish.ACC.SG’ 
e. /sestra/N              f. /sestru/N 
   ‘sister.NOM.SG’           ‘sister.ACC.SG’ 


(4.12) word-schema lexical entries (Russian) 


a. suffixes 
   /Xa/N           /Xu/N 
   ‘X.NOM.SG’      ‘X.ACC.SG’ 
b. roots 
   /rukX/N         /ribX/N         /sestrX/N 
   ‘hand’          ‘fish’          ‘sister’ 


Note that in (4.11), the lexical entries are presented using the word- 
schema formalism from Chapter 3. This formalism is useful because under 
the hypothesis of a moderate word-form lexicon, it is more accurate to say 
that a morphological pattern, rather than a morpheme, can be a lexical entry. 
In other words, ‘morpheme’ lexical entries need not be restricted to roots 
or affixes in a moderate word-form lexicon. We will often continue to talk 
about the contents of the lexicon in terms of morphemes for the sake of 
convenience, but it is important to remember that the same principles also 
apply to morphological patterns that are not as easily described in terms of 
morphemes. 
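
The idea that schemas are generalizations over stored word-forms can be illustrated with a rough sketch. It takes the six Russian word-forms listed above as given and uses the shared initial string of each lexeme's forms as a crude stand-in for the generalization a speaker might make. This works only for purely suffixing patterns, and all names are hypothetical.

```python
import os

# Stored word-forms: lexeme gloss -> {inflectional value: form}
ENTRIES = {
    "hand":   {"NOM.SG": "ruka",   "ACC.SG": "ruku"},
    "fish":   {"NOM.SG": "riba",   "ACC.SG": "ribu"},
    "sister": {"NOM.SG": "sestra", "ACC.SG": "sestru"},
}


def root_schema(forms):
    """Shared initial string of a lexeme's forms, with X for the variable part."""
    return os.path.commonprefix(list(forms)) + "X"


def suffix_schemas(entries):
    """For each inflectional value, collect what remains after the shared part."""
    result = {}
    for paradigm in entries.values():
        shared = os.path.commonprefix(list(paradigm.values()))
        for value, form in paradigm.items():
            result.setdefault(value, set()).add("X" + form[len(shared):])
    return result


print({gloss: root_schema(p.values()) for gloss, p in ENTRIES.items()})
print(suffix_schemas(ENTRIES))
```

Every schema in the output is derived from the listed word-forms, not stored independently of them, which is the sense in which morphological patterns are secondary in a moderate word-form lexicon.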

The primary difference between this approach and the strict word-form 
lexicon lies in the status of morphological patterns. In a moderate word-form 
lexicon, word-forms are still primary, but morphological patterns, including 
ones that we can identify as ‘morphemes’, are allowed a secondary role and 
they can be lexical entries. Under this view, many complex words will be 
listed in the lexicon, but some will be composed on the fly from component 
parts when needed. From here on, the terms word-form lexicon and word-based 
model will be used to refer to this approach, rather than the more rigid 
word-form-only approach to the lexicon discussed above. 

In Section 3.2.2, where the word-based model was first discussed, it 
was taken for granted that both predictable complex word-forms and 
morphological patterns are lexical entries. In other words, that discussion 
already assumed the hypothesis of the moderate word-form lexicon, without 
calling it that or justifying it. We now look at evidence for this position. 

A moderate word-form lexicon faces an immediate challenge. If the 
lexicon consists of both word-forms and morphemes, in our description we 
must determine which words are directly stored in the lexicon, and which 
are composed on the fly from morphemes. While most morphologists agree 
that all simple and at least some complex words are listed in speakers' 
lexicons, it is difficult to say which complex words are listed and which 
ones are not. 

Part of the issue has to do with the fact that any language contains both 
words that are familiar to most speakers (such as mis-represent and global-ize 
in English) and words that are novel and were perhaps never used before 
(such as mis-transliterate and bagel-ize, two words that we have just made 
up). Morphologists refer to these two types of words as actual words and 
possible words (or usual and potential words). Thus, the set of words 
in a language is never quite fixed. Speakers have the capacity to create, 
and hearers can understand, an almost unlimited number of new words. 
Dictionaries can record only the actual words, but at any time a speaker 
may use a possible (but non-actual) word, and, if it is picked up by other 
speakers, it may join the set of actual words (thus, if the number of bagel 
shops in Europe continues to grow, people will perhaps start saying that 
Europe is being bagelized). Attested novel lexemes that were not observed 
before in the language are called neologisms, and neologisms that do 
not really catch on and are restricted to occasional occurrences are called 
occasionalisms (or nonce formations). Most occasionalisms are probably 
never recorded, and, even among those that are recorded, many disappear 
soon afterwards. For instance, in 1943 the new word deglamorize was 
observed and recorded by a linguist, perhaps because it was used repeatedly 
around that time (Algeo 1991). But it seems that the word has not caught on 
and has not really become part of the English lexicon (even though the OED 
records it). Around the same time, the word decolonize arose. This word 
was more successful, and most English speakers nowadays know it. It has 
thus become a truly actual word of English. On a practical level, then, it is 
simply not possible to document which rare words have been encountered 
(and sufficiently well remembered) by which speakers. With a word-form 
lexicon (strict or moderate), we would be forced to decide how established 
a word must be in the language to be memorized by speakers. 

Additionally, and more importantly, many linguists argue that the 
hypothesis of a moderate word-form lexicon also faces a complex challenge 
related to how speakers recall words stored in the lexicon. Imagine that the 
language user has just heard the word insane. If both morphemes and word- 
forms are stored in the lexicon, that person can retrieve the meaning of the 
word from his mental lexicon in two different ways - either by breaking 
the word into its component morphemes and looking each up (in-, sane), 
or by looking up the word-form directly (insane), assuming it is stored. 
The process of looking up a word in the lexicon is known as lexical access. 
When lexical access occurs by breaking up words into morphemes, this is 
the (morphological) decomposition route; retrieving complex word-forms 
without decomposition is the direct route. 

Which method is a language user more likely to employ? Some linguists 
argue that the answer is, in some sense, both. There is evidence that when 
speakers need to retrieve a word from the lexicon, they try both routes 
simultaneously. The ‘winner’ is whichever method is faster in accessing 
the information. Lexical access is thus a kind of race which exists to make 
the mental work of processing language more efficient. This postulation 
is represented in Figure 4.2. The solid lines indicate lexical access via the 
decomposition route. The dashed line indicates lexical access via the direct 
route. (We will return to the thickness of the circles in the discussion below.) 


[Figure 4.2 not reproduced; input: ‘... insane ...’] 

Figure 4.2: A schematized dual-route model of lexical access (Hay 2001: 1045) 


Given that the decomposition route and the direct route compete to 
retrieve words, a natural question is: What determines which route wins? 
There seems to be a variety of relevant factors; here we discuss three: 
frequency, morpheme segmentability and allomorphy. These factors may 
all support the same route, or they may conflict. Lexical access, and by 
extension also the contents of the lexicon, are thus quite complicated.* 

First, there is a general consensus among linguists that speakers have 
detailed (subconscious) knowledge of how frequently a word is used. 
Metaphorically, this can be thought of as a representation of how well a 
person remembers a word. In a basic sense, lexical entries (both word-forms 
and morphemes) that are used more frequently, i.e. items that have a 
higher token frequency, are more firmly established in a person's 
memory, and have a stronger representation in the lexicon. We will say 
that these words have greater memory strength. Those that are used less 
have less memory strength. In Figure 4.2 above, the thickness of the circles 
represents token frequency (thicker = more frequently used = greater 
memory strength). 

Frequently used words can be accessed more quickly. As a consequence, 
token frequency strongly influences the representation of complex words 
in the lexicon: if the word-form has a higher frequency than its base (all 
other factors being equal), it is more likely to be accessed via the direct 
route (Hay 2001). This is the scenario represented in Figure 4.2. In this case, 
the complex word insane is more common in English than the word sane. 
According to one count, the word-form sane is used about 8 times in every 
1 million words of written text. Insane is used 14 times per million words 
(CELEX English database, Baayen et al. 1995). This difference leads insane 
to have greater memory strength, and the direct route, represented by the 
dashed line, wins. Conversely, whenever the root is more common (again, 
all else equal), decomposition is more likely. 
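
The frequency comparison can be written out as a minimal sketch. It is a deterministic simplification of what the text describes as a matter of likelihood, it ignores the other factors discussed below, and its treatment of equal frequencies is an arbitrary choice; the function name is hypothetical.

```python
def likely_route(word_form_freq, base_freq):
    """Compare token frequencies (per million words), all other factors being equal."""
    if word_form_freq > base_freq:
        return "direct"
    # Ties are sent to decomposition here; the text does not address them
    return "decomposition"


# Figures cited above from the CELEX English database: insane 14, sane 8
print(likely_route(14, 8))
```

With the cited figures, the complex word is more frequent than its base, so the sketch returns the direct route for insane.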

The second factor affecting whether a complex word is stored in the 
lexicon is segmentability. Complex words with segmentable affixes are 
intuitively more likely to be stored according to these affixes. Words that 
display morphological patterns which are less segmentable, such as base 
modification, are more likely to be stored as word-forms. 

Third, affixes that induce allomorphy in the base to which the affix attaches 
are less likely to be decomposed into morphemes than are affixes which do 
not cause allomorphy in the base. Examples of phonological allomorphy 
induced by the English suffix -ity are given in (4.13a). The final [k] in electric 
changes to an [s] in electricity, and the stress changes syllables. In divinity, 
the vowel changes quality. By contrast, in the (b) examples, adding -ship to a 
base does not change its phonological shape. 


* It should be noted that lexical access is different from lexical content. This means that how 
a lexical entry is accessed is not exactly the same as what is listed in the lexicon. In this 
chapter we are interested in the latter issue. Nonetheless, the two are closely related, so we 
will assume that the same factors are important for both. 

(4.13) a. base allomorphy induced by -ity: electric ~ electricity, divine ~ 
          divinity 
       b. no base allomorphy induced by -ship: ambassador ~ ambassadorship, 
          representative ~ representativeship 


We can perhaps think of these last two issues in terms of the saliency of the 
internal morphemic structure of a complex word-form. When morphemes 
are segmentable and do not have multiple allomorphs, the internal structure 
may be more salient because the boundaries between morphemes are easier 
to identify. A summary of all three factors is given in Table 4.3. 


Factors                       Direction of influence 

Relative token frequency      More frequent → word-form storage 
of word-form 
Segmentability                Less segmentable → word-form storage 
Allomorphy                    More effect on base → word-form storage 


Table 4.3 Three factors influencing lexical storage: summary 


Finally, an important issue is whether decomposition entails full 
decomposition. The discussion above considered only words with two 
morphemes, but of course, words can have more than two morphemes, 
e.g. insanely (in+sane+ly). It is unlikely that insanely is directly listed in the 
lexicon because insanely is used much less frequently than is insane (less 
than one time in every one million words), and all three morphemes are 
easily segmented. But it is not entirely clear how insanely is decomposed. 
Is it stored in the lexicon with three separate lexical entries (in-, sane, -ly), 
or as two (insane, -ly)? There is some evidence that the latter is probably 
correct in this case, based primarily on the relative frequency of sane and 
insane. Decomposition thus does not necessarily entail full decomposition 
into component morphemes. 

Ultimately, the structure of the lexicon is a very active area of research, but 
the evidence increasingly suggests that while both morphemes and words 
are listed, the balance is easily tipped in favour of word-based storage. 
This suggests that the lexicon, and thus morphology more generally, are 
fundamentally word-based. While a moderate word-form lexicon is not 
very economical, research to date suggests that it is the most cognitively 
realistic of the three proposals. 


Summary of Chapter 4 


A language user's mental dictionary is the lexicon. A major theoretical 
issue is whether the lexicon consists of morphemes or word-forms. 
This is important because lexical entries are the fundamental building 
blocks of morphological structure. There are several problems with the 
hypothesis of a morpheme lexicon, including the non-compositional 
meaning of many derived lexemes, and problems breaking words 
into morphemes, including base modification patterns, cumulative 
expression, zero affixes and empty morphs. Still, there are also 
problems with the strict word-form lexicon hypothesis: for example, 
some types of morphological rules seem to rely on the concept of 
the morpheme, and morphemes may have a real status for speakers. 
Combined with factors (e.g. frequency) that seem to promote 
decompositional route (morpheme-based) lexical access under some 
conditions and direct route (word-based) lexical access under others, 
the best conclusion is that morphological structure is fundamentally 
word-based, but morphemes (or more properly, morphological 
patterns) represent secondary generalizations. Word-forms, 
morphemes and derived stems are all stored in the lexicon. 


Further reading 


Classic works on the lexicon and lexical access include Butterworth (1983) 
and Taft and Forster (1975). The papers in Feldman (1995) and Baayen and 
Schreuder (2003) discuss a wide variety of issues related to morphological 
processing, and the papers in Jarema and Libben (2007) provide current 
perspectives on the structure of the lexicon from a psycholinguistic perspective. 

For details of a parallel dual-route model, see Schreuder and Baayen 
(1999). Also, Hay (2003) demonstrates the role of relative frequency, 
Jarvikivi et al. (2006) look at allomorphy, and Bertram et al. (2000) discuss 
more generally the complex interaction of factors which promote or inhibit 
morphological decomposition. 

Pinker (1991) and Clahsen et al. (1997) represent an opposing view that 
words with irregular inflection are accessed as whole word-forms, but 
regularly inflected words are always decomposed. Similarly, Caramazza et al. 
(1988) and Taft (1994) give greater weight to morphological decomposition. 

Non-psycholinguistic models of morphology are also divided over 
whether morphemes, stems, or whole words are the fundamental units of 
morphological structure. See the Further reading section in Chapter 3 for 
references. 


Comprehension exercises 


1. Which of the following English words are actual, possible and impossible? 

   replay, libertarian, itinerance, reknow, fraternitarian, penchance, rebagelize, 
   abundance, happytarian 


2. Which of the following words have compositional meaning, and which 
   have non-compositional meaning? 

   a. ability, popularity, community, morality, authority 
   b. materialize, modernize, legalize, vaporize, specialize 


3. Look again at the words in question 2. According to the hypothesis 
   of a moderate word-form lexicon, the words in (a) are more likely to 
   be stored as whole word-forms than the words in (b). Why? What is 
   the most relevant factor that distinguishes the two groups? (Ignore 
   frequency, since the necessary information is not provided.) 


4. For each of the following languages, determine whether the examples 
   exhibit cumulative expression, empty morphs or zero expression. (Some 
   may exhibit more than one of these features.) Explain your answers. 


a. Finnish pronouns (partial paradigm) 

           1ST P. PL    2ND P. PL    3RD P. PL 
   NOM     me ‘we’      te ‘you’     he ‘they’ 
   GEN     meidän       teidän       heidän 
   PAR     meitä        teitä        heitä 
   ESS     meinä        teinä        heinä 
   INESS   meissä       teissä       heissä 
   ELA     meistä       teistä       heistä 


b. Ndebele imperative verbs 

   ROOT       IMPERATIVE   GLOSS 
   lim-       lima         ‘cultivate!’ 
   nambith-   nambitha     ‘taste!’ 
   dl-        yidla        ‘eat!’ 
   m-         yima         ‘stand!’ 
   z-         yiza         ‘come!’ 
   lw-        yilwa        ‘fight!’ 

   (Inkelas and Zoll 2000: 5) 


c. Serbian present tense verbs: GOVORITI ‘to speak, say’ and TRESTI ‘to 
   shake’ 

                SINGULAR   PLURAL 
   1ST PERSON   govorim    govorimo 
   2ND PERSON   govoriš    govorite 
   3RD PERSON   govori     govore 

                SINGULAR   PLURAL 
   1ST PERSON   tresem     tresemo 
   2ND PERSON   treseš     tresete 
   3RD PERSON   trese      tresu 


Exploratory exercise 


In this chapter we argued that frequency has an effect on whether a complex 
word is directly given in the lexicon. But we did not ask two important 
methodological questions: if it is possible for a word-form like insane to 
be stored in the lexicon either via morphemes or directly, how can we, as 
researchers, know how the word is stored? And how can we know that 
frequency is an important factor? After all, we cannot directly observe 
the contents of the lexicon. There are a handful of methods for testing the 
content of the lexicon, and factors related to it. This research exercise has 
two purposes: first, to introduce one of these methods, and second, to test 
the hypothesized relationship between frequency and lexical storage that 
was presented in this chapter. 

The basic methodology involves asking speakers to judge how related 
two words are to each other in meaning. For target items (i.e. the ones that 
are the focus of interest), one of the words is complex and the other is its 
base. The assumption underpinning this task is that speakers should judge 
complex words that are stored as morphemes as being closely related in 
meaning to their bases because the base and the complex word share a 
lexical entry. Thus, for example, if insane is stored in the lexicon according 
to the morphemes in- and sane, the meaning of the entire word should 
depend closely on the meaning of sane. By contrast, a complex word that is 
stored as its own lexical entry may be judged as less close in meaning to its 
root because the two are formally distinct in the lexicon. By manipulating 
the frequency of the complex word, we can test for a correlation between 
frequency and meaning closeness. And under the assumption that meaning 
closeness reflects lexical storage, a correlation (or lack thereof) should 
indicate whether frequency influences lexical storage. 

The instructions below use English for demonstration purposes because 
it is a language that all readers are familiar with. However, this exercise 
could be conducted using virtually any language. 


Instructions 


Step 1: Choose an affix to study. The best choices will be ones that (a) 
frequently attach to monomorphemic bases, as opposed to only attaching 
to already-derived stems, and (b) create a new lexeme rather than a word- 
form of the same lexeme (see Section 2.1 for this distinction). For example, 
for English we might choose to study the suffix -ity (obscurity, immensity, etc.). 

Step 2: Create words to be used in the experiment. Make a list of words 
containing the affix that you chose in Step 1. Also use a frequency dictionary 
(i.e. a dictionary that gives counts of how frequently a particular word-form 
is used) or a written or spoken corpus of the language to gather information 
about the token frequency of these words. Then, sort them into ‘frequent’ 
and ‘infrequent’ groups. (You may want to remove words of intermediate
frequency, to create two maximally distinct frequency groups.) A few examples 
for -ity are given below. Of course, you will need a longer list of words. 


Complex word    Freq/Infreq    Root
obscurity       infrequent     obscure
acidity         infrequent     acid
modernity       infrequent     modern
opportunity     frequent       opportune
priority        frequent       prior
security        frequent       secure


Table 4.4 Exploratory exercise: possible stimuli 
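The sorting described in Step 2 can be sketched in a few lines of Python. This is only an illustration: the function name, the two cut-off values and all of the token counts are invented for the example, and real counts would have to come from a frequency dictionary or corpus.

```python
def split_by_frequency(counts, low, high):
    """Sort words into 'infrequent' (count <= low) and 'frequent' (count >= high).

    Words whose counts fall strictly between the two cut-offs are dropped,
    which keeps the two groups maximally distinct.
    """
    groups = {"frequent": [], "infrequent": []}
    for word, count in sorted(counts.items()):
        if count >= high:
            groups["frequent"].append(word)
        elif count <= low:
            groups["infrequent"].append(word)
    return groups


# Hypothetical token counts, for illustration only.
example_counts = {
    "obscurity": 40,
    "acidity": 25,
    "modernity": 60,
    "opportunity": 9000,
    "priority": 5000,
    "security": 12000,
    "curiosity": 800,  # intermediate frequency: belongs to neither group
}

groups = split_by_frequency(example_counts, low=100, high=2000)
```

Where the cut-offs should lie depends on the corpus; one reasonable approach is to inspect the full distribution of counts before fixing them.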


Tips for finding words: For some languages, online dictionaries can 
be searched to find all words that contain some sequence of letters in a 
particular word position (e.g. the letters ity at the end of the word). Look 
for ones that allow wildcard searching. Some languages also have freely 
available online corpora that can be used in the same way. Also, if you have 
chosen a suffix rather than a prefix, find out whether the language has a 
reverse dictionary. In a reverse dictionary, entries are alphabetized from the 
end of the word to the beginning, rather than the usual beginning-to-end 
method. This groups all words with a given suffix together (assuming the 
word ends with the suffix), making examples easier to find. 
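If a plain word list is available in electronic form, both tips can be approximated with a short script. The sketch below assumes a small hypothetical word list; the function names are our own and do not belong to any dictionary tool.

```python
def words_with_suffix(words, suffix):
    """Return the words that end in the given letters (like the wildcard search *ity)."""
    return [w for w in words if w.endswith(suffix)]


def reverse_dictionary_order(words):
    """Alphabetize words from the end of the word to the beginning."""
    return sorted(words, key=lambda w: w[::-1])


# A hypothetical word list, for illustration only.
word_list = ["obscure", "acidity", "city", "helpful", "modernity", "awful", "security"]

ity_words = words_with_suffix(word_list, "ity")
reverse_sorted = reverse_dictionary_order(word_list)
```

Note that a letter search is not a morphological analysis: city ends in the letters ity but does not contain the suffix -ity, so the results still need to be checked by hand.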

Next, add a variety of ‘filler’ words. The fillers should be derived lexemes 
that range in semantic similarity to their roots, but do not contain the affix 
that you chose in Step 1. For example, at one end of the scale might be 
words like helpful, which is very similar in meaning to help, at the other 
end of the scale words like awful, which is not at all similar to awe, and 
in the middle words like artful. You might even want to include ‘false’ 
derived lexemes. For example, defend looks like it can be broken into two 
morphemes (de-fend) because de- is sometimes a prefix (e.g. de-regulate) and 
fend is a real word, but this would be a false segmentation. The fillers are 
included primarily to create a wide range of pairs of words. They also help 
to distract the speaker from the purpose of the study, namely studying the 
effects of frequency. (People do strange things when they think they know 
what you want them to say!)

Write each of the target words and its root on an index card, with the
complex word first. For example, one card might be obscurity-obscure. Do 
the same for the filler words. 

Alternative procedure: Hay (2001) argues that relative frequency is more 
important for lexical access than absolute frequency. In other words, as far 
as the lexicon is concerned, a frequent complex lexeme is one that is more 
frequent than its base, regardless of how often it is used in absolute terms. 
An infrequent complex lexeme is one that is less frequent than its base. As 
an alternative in Step 3, choose words that are frequent or infrequent in this 
relative sense. (Warning: This is much more difficult!) 
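The relative-frequency classification can be stated compactly as a comparison of two counts. The following is a minimal sketch of that comparison only; the function name and the counts are hypothetical, and it is not a reconstruction of the procedure used by Hay (2001).

```python
def relative_frequency_class(derived_count, base_count):
    """Classify a complex lexeme by comparing its frequency with that of its base."""
    if derived_count > base_count:
        return "frequent"    # used more often than its base
    if derived_count < base_count:
        return "infrequent"  # used less often than its base
    return "undetermined"    # equal counts: the definition does not decide


# Hypothetical (complex word, base) token counts, for illustration only.
pair_counts = {
    ("obscurity", "obscure"): (40, 300),
    ("opportunity", "opportune"): (9000, 50),
}

classes = {
    pair: relative_frequency_class(derived, base)
    for pair, (derived, base) in pair_counts.items()
}
```

Part of the difficulty mentioned in the warning is that two reliable counts are needed for every pair, one for the complex word and one for its base.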

Step 3: Review the discussion in Section 4.3 about frequency. This 
discussion represents the hypothesis about the relationship between 
frequency and word-form storage in the lexicon. Based on this discussion, 
develop specific predictions. Predictions are expectations about how the 
data will turn out if the hypothesis is correct. Based on what you have read, 
how do you expect frequency of the complex word to be related to speakers’ 
judgements about semantic relatedness of a complex word and its root? 
For example, do you expect the words to be judged to be further apart in 
meaning when the complex word is frequent? Do you expect the opposite? 
Or no relationship between the frequency and meaning closeness? Explain 
the rationale behind your prediction. Remember to also consider what data 
you would expect to find if the hypothesis is not true. 

Step 4: Run the study. Find native speakers of the language who are 
willing to participate in the project. Several participants is ideal, but even 
a few people can produce interesting results. Present them with each card. 
Ask them to ‘Rate how much the meaning of the first word is related to the 
meaning of the second word on a scale from 1 to 7 in which 7 means "very 
related" and 1 means "not at all related".’ Write down the rating for each
card and each speaker. Hint: The target word cards should be presented in 
random order, with fillers mixed in. However, it helps if the first 5-10 cards 
are filler pairs representing different points on the scale. This allows the 
subject to practise and calibrate her/his judgements. 
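The ordering suggested in the hint can also be produced by script if the stimuli are kept in a file rather than on cards. In this sketch the function name, its parameters and the word pairs are assumptions made for illustration, and the toy warm-up list is shorter than the 5-10 cards recommended above.

```python
import random


def presentation_order(targets, fillers, warmup, seed=None):
    """Return the cards in presentation order.

    `warmup` holds hand-picked filler pairs that span the rating scale; they
    come first, in the order given. The remaining fillers and all targets
    follow in random order.
    """
    rng = random.Random(seed)  # a seed makes the order reproducible
    rest = list(targets) + [pair for pair in fillers if pair not in warmup]
    rng.shuffle(rest)
    return list(warmup) + rest


targets = [("obscurity", "obscure"), ("security", "secure"), ("priority", "prior")]
fillers = [("helpful", "help"), ("artful", "art"), ("awful", "awe"), ("defend", "fend")]
warmup = [("helpful", "help"), ("artful", "art"), ("awful", "awe")]

order = presentation_order(targets, fillers, warmup, seed=1)
```

Calling the function with a different seed for each participant gives each person a different random order after the warm-up cards.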

Step 5: Analyze the data. Determine whether ratings differ depending 
upon the frequency of the complex word. Was your prediction upheld? 
At the same time, also consider factors that are not directly related to the 
research questions but may have affected the results. For example, did 
all speakers give similar ratings? Did all frequent (or infrequent) target 
words produce similar ratings? If not, this may suggest other issues that 
the study did not take into consideration. Also think about the impact of 
the methodology. For example, is semantic transparency a good measure of 
word-based vs. morpheme-based listing? Why or why not? Will a complex 
word that has its own lexical entry always lead speakers to assign lower 
semantic relatedness ratings? Why or why not? 
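A first pass at Step 5 is to compare the average rating of the two frequency groups. The sketch below assumes that ratings were recorded per speaker and per complex word; the data layout, the function name and every rating in it are invented, so the numbers illustrate the calculation and nothing more.

```python
from statistics import mean


def mean_rating_by_group(ratings, frequency_group):
    """Average the ratings of target words within each frequency group.

    `ratings` maps (speaker, complex_word) to a rating from 1 to 7.
    `frequency_group` maps each target word to 'frequent' or 'infrequent';
    words missing from it (the fillers) are ignored.
    """
    collected = {}
    for (speaker, word), rating in ratings.items():
        group = frequency_group.get(word)
        if group is not None:
            collected.setdefault(group, []).append(rating)
    return {group: mean(values) for group, values in collected.items()}


# Invented ratings from two hypothetical speakers, for illustration only.
frequency_group = {
    "obscurity": "infrequent", "acidity": "infrequent",
    "security": "frequent", "priority": "frequent",
}
ratings = {
    ("S1", "obscurity"): 7, ("S1", "acidity"): 6,
    ("S1", "security"): 5, ("S1", "priority"): 3,
    ("S1", "awful"): 1,  # filler, ignored
    ("S2", "obscurity"): 6, ("S2", "acidity"): 7,
    ("S2", "security"): 4, ("S2", "priority"): 4,
}

group_means = mean_rating_by_group(ratings, frequency_group)
```

Averages hide exactly the variation that Step 5 asks about, so it is worth also listing the ratings by speaker and by word before drawing conclusions.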

Step 6: Draw conclusions. If your prediction was upheld, does this
suggest that the hypothesis is correct? Why or why not? If your prediction
was not upheld, what does this suggest? In short, what do the data indicate 
about the relationship between frequency, semantic transparency and the 
content of the lexicon? 


Inflection and derivation


In this chapter, we discuss inflection and derivation in greater depth. As
we saw in Section 2.1, this conceptual distinction is quite basic to most
morphological theorizing and terminology, though it is not always easy 
to determine the relation between two word-forms: does nicely belong to 
a separate lexeme from nice, or are both word-forms in the paradigm of 
NICE? In other words, is the suffix -ly that is attached to nice to form nicely a 
derivational suffix or an inflectional suffix? 

We survey inflectional functions in Section 5.1 and derivational meanings 
in Section 5.2. In Section 5.3 we examine a range of properties that have 
been proposed as distinguishing between inflection and derivation, 
and between two subtypes of inflection. Section 5.4 gives an overview 
of the ways in which the relation between inflection and derivation has 
been conceptualized by morphologists. The two most important views 
are the dichotomy approach, which argues that complex words can be 
neatly divided into two disjoint classes, and the continuum approach, 
which claims that morphological patterns are best understood as lying 
on a continuum ranging from the most clearly inflectional patterns to the 
most clearly derivational patterns. Finally, in Section 5.5 we briefly show 
some implications of these views for how linguists model the relationship 
between morphology and syntax. 


5.1 Inflectional values


Morphologists usually talk in quite different terms about inflection and 
derivation. For instance, the different inflectional formations are referred to 
as expressing inflectional values (or inflectional feature values), so we say, 
for instance, that English verbs express the inflectional values ‘present’ (e.g. 
(he/she) walks) and ‘past’ (e.g. (he/she) walked). But for derived lexemes like 
walker we would not normally say that it represents a ‘derivational value’
(‘agent noun’); instead we simply talk about derivational meanings. One
reason for this distinction is that inflectional values often do not have a 
clearly identifiable meaning, only a syntactic function. For example, walks 
differs from walk in that walks is used when the subject is third person 
singular (she, he, it) and walk is used with other subjects (I, you, we, they), 
but many linguists feel uncomfortable calling this a difference of meaning 
because it is quite abstract. 

Different languages vary quite dramatically in the amount of inflectional 
complexity that their words exhibit. Some languages, such as Vietnamese 
and Igbo, a language of Nigeria, have no (or virtually no) inflectional 
values, and others have inflection for more than a dozen values (though 
it is uncommon for a single word-form to be inflected for more than half a 
dozen values). However, despite all this diversity, the types of inflectional 
values that we find across languages are surprisingly uniform. Perhaps 
more than two-thirds of all inflectional values fall into one of the classes of 
Table 5.1. 


On nouns, pronouns              On verbs                        On adjectives, demonstratives,
                                                                relative pronouns, adpositions

number                          number                          number
(SINGULAR, PLURAL, ...)         (SINGULAR, PLURAL, ...)         (SINGULAR, PLURAL, ...)

case                            person                          case
(NOMINATIVE, ACCUSATIVE, ...)   (1ST, 2ND, 3RD)                 (NOMINATIVE, ACCUSATIVE, ...)

gender                          tense                           gender
(MASCULINE, FEMININE, ...)      (PRESENT, FUTURE, PAST, ...)    (MASCULINE, FEMININE, ...)

person                          aspect                          person
(1ST, 2ND, 3RD)                 (PERFECTIVE, IMPERFECTIVE,      (1ST, 2ND, 3RD)
                                HABITUAL, ...)

                                mood
                                (INDICATIVE, SUBJUNCTIVE,
                                IMPERATIVE, ...)


Table 5.1 Common inflectional features and values 


As the organization of Table 5.1 suggests, inflectional values are
often naturally grouped together into super-categories that we will call
inflectional features.¹ Two values belong to the same feature if they share
a semantic (or more generally, functional) property and are mutually 
exclusive. For instance, the English present and past tenses both have to do 
with when an event happens, relative to the moment of speaking (so they 
share a semantic property), and they cannot occur together in the same verb 
(they are mutually exclusive). Thus, they are values of the feature ‘tense’. 
We have already seen number and case inflection of nouns, e.g. in Latin 
(repeated as Figure 5.1).² The number feature is self-evident; it indicates
quantity. Case indicates the semantic and syntactic role of a noun in a 
sentence. A given case may express many roles, but one is usually considered 
basic. Among the Latin cases, nominative canonically marks subjects and is 
the citation form, accusative marks direct objects, and dative marks indirect 
objects. Genitive canonically indicates the possessor ('s in student's book is 
a genitive marker and one of the few case values in English), and ablative
means ‘movement away from’.


                        NUMBER
                  SINGULAR    PLURAL
CASE  NOMINATIVE  insula      insulae
      ACCUSATIVE  insulam     insulas
      GENITIVE    insulae     insularum
      DATIVE      insulae     insulis
      ABLATIVE    insula      insulis


Figure 5.1 Case and number in Latin 


Latin is a fairly typical language in terms of number: most languages mark 
singular and plural on nouns. Fewer distinguish a dual number, and 
even fewer a paucal number (paucal means ‘a few’). Languages vary in 
the number of morphological cases they express; Latin has five cases, but 
many languages have no case distinctions at all, and a few have more than 
ten different cases. Typically, inflectional values of number and inflectional 
values of case combine freely, as shown. 
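The free combination of case and number amounts to a cross-classification: every case value pairs with every number value, giving 5 × 2 = 10 cells. A minimal sketch in Python, using the forms of Figure 5.1 (vowel length is not marked, and the dictionary layout is simply one convenient assumption about how to store a paradigm):

```python
from itertools import product

CASES = ["NOMINATIVE", "ACCUSATIVE", "GENITIVE", "DATIVE", "ABLATIVE"]
NUMBERS = ["SINGULAR", "PLURAL"]

# The paradigm of Latin INSULA 'island', keyed by (case, number).
INSULA = {
    ("NOMINATIVE", "SINGULAR"): "insula",
    ("NOMINATIVE", "PLURAL"): "insulae",
    ("ACCUSATIVE", "SINGULAR"): "insulam",
    ("ACCUSATIVE", "PLURAL"): "insulas",
    ("GENITIVE", "SINGULAR"): "insulae",
    ("GENITIVE", "PLURAL"): "insularum",
    ("DATIVE", "SINGULAR"): "insulae",
    ("DATIVE", "PLURAL"): "insulis",
    ("ABLATIVE", "SINGULAR"): "insula",
    ("ABLATIVE", "PLURAL"): "insulis",
}

# Every combination of one case value and one number value.
cells = list(product(CASES, NUMBERS))

# Cells of the cross-classification for which the paradigm has no form.
missing = [cell for cell in cells if cell not in INSULA]
```

For this lexeme the list of missing cells is empty, which is what ‘combine freely’ means here: no combination of a case value with a number value lacks a word-form.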


1 Some linguists use the term inflectional category for our inflectional feature, and inflectional
property for our inflectional (feature) value.

2 See the Appendix of this chapter for notation conventions related to inflectional values.

Person distinctions are widely marked on verbs in the world's languages,
but only in a limited way in English (and not even all dialects of English). 
The only verb to fully mark person values is be; there are distinct forms of be 
depending upon whether the subject refers to the speaker (first person), the 
addressee (second person) or a third party (third person), i.e. (I) am, (you) 
are, (he/she/it) is. In some languages, the verb marks person according to the 
value of the object, rather than the subject, or according to both (see (5.6) in 
Section 5.3.1 for an example from Yimas). 

The features tense, aspect and mood exist to some extent in virtually 
all languages that have any inflection at all. Tense indicates the temporal 
location of a verb's action (past, present, future). Aspect has to do with the 
internal temporal constituency of an event, for example, whether the action is 
viewed as completed (perfective), non-completed (imperfective), habitual, 
etc. Finally, mood denotes the certainty, desirability or conditionality of 
an event. It subsumes a wider range of inflectional values, including the 
imperative (commands), subjunctive (non-realized events) and indicative 
(events viewed as objective facts). 

The three feature names ‘tense’, ‘aspect’ and ‘mood’ suggest that values 
from these different features can be combined in the same way that case 
and number, or person and number, can be freely combined. Indeed, this 
is sometimes possible, for instance, in Latin, which has three tense values 
(present, past, future), two aspect values (infectum and perfectum; the 
latter is similar to the English perfect) and two mood values (indicative and 
subjunctive). See Figure 5.2. 


           INDICATIVE                               SUBJUNCTIVE
           INFECTUM      PERFECTUM                  INFECTUM      PERFECTUM
PRESENT    canta-t       canta-v-it       PRESENT   cant-e-t      canta-v-eri-t
PAST       canta-ba-t    canta-v-era-t    PAST      canta-re-t    canta-v-isse-t
FUTURE     canta-bi-t    canta-v-eri-t    FUTURE    —             —


Figure 5.2 Latin tense, aspect and mood forms (third person singular) 


However, the Latin system does not have all possible combinations: there 
are no future subjunctive forms. Moreover, this system is quite atypical 
in being as symmetrical as it is. In most languages, different inflectional 
values for tense, aspect and mood are difficult to combine. A language that 
contrasts with Latin in this respect is Swahili, where tense, aspect and mood 
are expressed by inflectional prefixes. In Figure 5.3, forms with the prefix 
n(i)- (first person singular) are given. Here, there are no obvious formal 
reasons for setting up such a paradigm with two mood values, three tense
values and three aspect values because there are no word-forms to express
most of the combinations. From a formal point of view, positing just a single
feature (‘tense/aspect/mood’) with seven values is simpler and does not
seem to miss crucial generalizations. Thus, many linguists nowadays work
with a single feature ‘tense/aspect/mood’.


INDICATIVE
               PRESENT         PAST              FUTURE
NORMAL         n-a-fanya       ni-li-fanya       ni-ta-fanya
PROGRESSIVE    ni-na-fanya     —                 —
PERFECT        ni-me-fanya     —                 —

HYPOTHETICAL
               PRESENT         PAST              FUTURE
NORMAL         ni-nge-fanya    ni-ngali-fanya    —
PROGRESSIVE    —               —                 —
PERFECT        —               —                 —


Figure 5.3 Swahili tense, aspect and mood forms (first person singular, -fanya ‘do’)

The explanation for the different behaviour of the combinations ‘case + 
number’ and ‘tense + aspect + mood’ lies in their semantics. All combinations 
of different cases and numbers are roughly equally plausible because the 
syntactic role of a noun in a sentence is logically independent of whether 
the noun refers to one or many entities. By contrast, certain combinations 
of aspect, tense and mood are unusual or downright exotic. For instance, 
perfective aspect (which implies that an event is viewed in its totality) does 
not go together well with present tense (which implies that the speaker is 
still in the middle of the event). Even more obviously, the imperative mood 
(which expresses a command) does not combine with the past tense. It is 
not surprising that most languages lack straightforward inflectional means 
for these combinations. 

Besides the inflectional values that we have seen up to now, there are 
quite a few others that are less easy to generalize about, but that are also less 
widespread. In English, adjectives have inflectional markers of comparative 
and superlative degree (big, bigger, biggest), but this kind of inflection is not 
common in the world’s languages — it seems to be largely confined to the 
languages of Europe and south-western Asia. 

In verbs, some languages have passive voice inflection, which indicates 
an unusual association of semantic roles and syntactic functions: the 
semantic patient is the syntactic subject (e.g. Swedish kasta ‘throw’, kasta-s 
‘be thrown’). This is the opposite of active voice, in which the semantic 
agent is the syntactic subject. (For more on passives, see Section 11.1.2.) And
many languages have inflectional expression of polarity (i.e. affirmative
versus negative, e.g. Japanese kir-u [cut-PRS] ‘cuts’, kir-ana-i [cut-NEG-PRS]
‘doesn’t cut’).

Finally, the most important kind of inflection that we have not already 
discussed is the group of dependent verb forms. Many languages have 
special verb forms that are confined to dependent clauses. Although the 
terminology is not uniform, a rough generalization says that verb forms 
marking relative clauses are called participles, verb forms marking 
adverbial clauses are called converbs and verb forms marking complement 
clauses are called infinitives or masdars. Examples of a participle, a converb 
and an infinitive are given in (5.1)-(5.3). 


(5.1) Korean participle
      Hankwuk-ul  pangmwunha-nun  salam-i     nul-ko        iss-ta.
      Korea-ACC   visit-PTCP      person-NOM  increase-ing  be-DECL
      ‘Those who visit Korea are increasing.’
      (S.-J. Chang 1996: 148)


(5.2) Hindi/Urdu converb
      Banie   ke    bete  ne   citthti       likh-kor   daak  mé  daal-ii.
      grocer  POSS  son   ERG  letter(F).SG  write-CVB  box   in  put.PST-F.SG
      ‘The grocer's son wrote and posted a letter.’
      (lit. ‘having written a letter, posted (it).’)


(5.3) Mparntwe Arrernte infinitive
      Re   lhe-tyeke  ahentyene-ke.
      she  go-INF     want-PST
      ‘She wanted to go.’
      (Wilkins 1989: 451)


5.2 Derivational meanings 


Derivational meanings are much more diverse than inflectional values. 
Besides cross-linguistically widespread meanings such as agent noun (e.g.
drinkV → drink-erN), quality noun (e.g. kindA → kind-nessN) and facilitative
adjective (e.g. readV → read-ableA), we also find highly specific meanings
that are confined to a few languages. For instance, Big Nambas, a language
of the South Pacific island nation of Vanuatu, has a suffix -et that derives
reverential terms from ordinary nouns (e.g. dui ‘man’ → dui-et ‘sacred man’,
navanel ‘road’ → navanel-et ‘sacred road’ (Fox 1979)). And French has a suffix
-ier that derives words for fruit trees from the corresponding fruit nouns (e.g.
pomme ‘apple’ → pomm-ier ‘apple tree’, poire ‘pear’ → poir-ier ‘pear tree’,
prune ‘plum’ → prun-ier ‘plum tree’).

There are too many types of derivational meaning to present here, but it
is worth discussing one frequent characteristic of derivation. Derivational 
patterns commonly change the word-class of the base lexeme — i.e. nouns 
can be derived from verbs, adjectives from nouns, and so on. For such cases, 
the terms denominal (‘derived from a noun’), deverbal (‘derived from a 
verb’) and deadjectival (‘derived from an adjective’) are in general use. 


5.2.1 Derived nouns 


Since creating new words for new concepts is one of the chief functions 
of derivational morphology, and since we have a greater need for naming 
diverse nominal concepts, languages generally have more means for 
deriving nouns than for deriving verbs and adjectives (Bauer 2002). Some 
common meanings with examples from various languages are listed in 
Table 5.2. 


I. Deverbal nouns (V → N)
   agent noun³           English    drinkV             →  drink-erN
                         Arabic     hamalaV            →  hammaalN
                                    ‘carry’               ‘carrier’
   patient noun          English    inviteV            →  invit-eeN
   instrument noun       Spanish    picarV             →  pica-doraN
                                    ‘mince’               ‘meat grinder’
   action noun           Russian    otkry-t'V          →  otkry-tieN
                                    ‘discover’            ‘discovery’
II. Deadjectival nouns (A → N)
   quality noun          Japanese   atarasi-iA         →  atarasi-saN
                                    ‘new’                 ‘newness’
   person noun           Russian    umn-yjA            →  umn-ikN
                                    ‘smart, clever’       ‘clever guy’
III. Denominal nouns (N → N)
   diminutive noun       Spanish    gat-o              →  gat-it-o
                                    ‘cat’                 ‘little cat’
   augmentative noun     Russian    borod-a            →  borod-išča
                                    ‘beard’               ‘huge beard’
   status noun           English    child              →  child-hood
   inhabitant noun       Arabic     Misr               →  misr-iyyu
                                    ‘Egypt’               ‘Egyptian’
   female noun           German     König              →  König-in
                                    ‘king’                ‘queen’


Table 5.2 Common derivational meanings of nouns 


3 The glossary gives definitions of the derivational meanings in Tables 5.2–5.4.

Besides these widespread derivational meanings, many more specific
derivational meanings are found in languages, but usually these are 
restricted to a few languages each. Thus, Russian has a suffix for nouns 
denoting kinds of meat (e.g. kon’ ‘horse’, kon-ina ‘horse meat’). Tagalog has 
a pattern for nouns meaning vendors (e.g. kandila ‘candle’, magkakandila 
‘candle vendor’ (Schachter and Otanes 1972: 103)). Various sciences have 
developed terminological conventions for creating new technical terms by 
suffixation (e.g. -itis as a suffix for inflammatory diseases, -ite as a suffix for 
minerals, -ide and -ate as suffixes for certain kinds of chemicals, and so on). 


5.2.2 Derived verbs 


Verb-deriving patterns are generally less numerous and diverse. Most 
commonly, verbs are derived from other verbs. Denominal and deadjectival 
verbs are much less widespread than deverbal verbs (Bauer 2002). Again, 
some typical examples are given in Table 5.3. 


I. Deverbal verbs (V → V)
   causative verb         Korean       cwuk-          →  cwuk-i-
   (see Section 11.1.4)                ‘die’             ‘kill’
   applicative verb       German       laden          →  be-laden
   (see Section 11.1.5)                ‘load’            ‘load onto’
   anticausative verb     Swedish      öppna          →  öppna-s
   (see Section 11.1.2)                ‘open (tr.)’      ‘open (intr.)’
   desiderative verb      Greenlandic  sini-          →  sini-kkuma-
                                       ‘sleep’           ‘want to sleep’
   repetitive verb        English      write          →  re-write
   reversive verb         Swahili      chom-a         →  chom-o-a
                                       ‘stick in’        ‘pull out’
II. Denominal verbs (N → V)
   ‘act like N’           Spanish      pirat-a        →  pirat-ear
                                       ‘pirate’          ‘pirate’
   ‘put into N’           English      bottleN        →  bottleV
   ‘cover with N’         Russian      sol’           →  sol-it’
                                       ‘salt’            ‘salt’
III. Deadjectival verbs (A → V)
   factitive              Russian      čern-yj        →  čern-it’
                                       ‘black’           ‘make black’
   inchoative             Spanish      verde          →  verde-ar
                                       ‘green’           ‘become green’


Table 5.3 Common derivational meanings of verbs


5.2.3 Derived adjectives


Derived adjectives are even less common than derived verbs, because 
adjectives are used more rarely than verbs, let alone nouns. Moreover, 
the semantic class of adjectives that is the most developed in a number 
of European languages, denominal relational adjectives (of the type 
government → governmental), seems to be quite rare in other areas of the
world. Typical examples of derived adjectives are shown in Table 5.4. 


I. Deverbal adjectives (V → A)
   facilitative          Basque     jan          →  jan-garri
                                    ‘eat’           ‘edible’
   agentive              Spanish    habla-r      →  habla-dor
                                    ‘talk’          ‘talkative’
II. Denominal adjectives (N → A)
   relational            Russian    korol’       →  korol-evskij
   (= ‘related to N’)               ‘king’          ‘royal’
   proprietive           Ponapean   pihl         →  pil-en
   (= ‘having N’)                   ‘water’         ‘watery’
   privative             Russian    vod-a        →  bez-vod-nyj
   (= ‘lacking N’)                  ‘water’         ‘waterless’
   material              German     Kupfer       →  kupfer-n
                                    ‘copper’        ‘made of copper’
III. Deadjectival adjectives (A → A)
   attenuative           Tzutujil   kaq          →  kaq-koj
                                    ‘red’           ‘reddish’
   intensive             Turkish    yeni         →  yep-yeni
                                    ‘new’           ‘brand new’
   negative              German     schön        →  un-schön
                                    ‘beautiful’     ‘ugly’


Table 5.4 Common derivational meanings of adjectives 


5.3 Properties of inflection and derivation 


Let us now look at the properties of inflectional and derivational morphology.
The ultimate goal is to determine whether inflection and derivation have 
sufficiently different traits as to suggest that they represent two distinct 
subsystems in morphological architecture. We will call the hypothesis of 
a formal distinction the dichotomy approach. The other possibility is that 
we should model inflection and derivation as a continuum, with canonical 
inflection at one extreme, and canonical derivation at the other, but many
intermediary types. We call this the continuum approach. This choice has
broad consequences for the relationship between morphology and syntax 
(discussed in Section 5.5 below), but we start by looking at some of the 
empirical facts underlying this debate. 

Table 5.5 gives an overview of properties that differentiate inflection 
and derivation. Some of these are all-or-nothing properties, and others are 
relative, i.e. a complex word may have the property to a greater or lesser 
extent. We discuss these in turn below. 


        Inflection                            Derivation

(i)     relevant to the syntax                not relevant to the syntax
(ii)    obligatory expression of feature      not obligatory expression
(iii)   unlimited applicability               possibly limited applicability
(iv)    same concept as base                  new concept
(v)     relatively abstract meaning           relatively concrete meaning
(vi)    compositional meaning                 possibly non-compositional meaning
(vii)   expression at word periphery          expression close to the base
(viii)  less base allomorphy                  more base allomorphy
(ix)    no change of word-class               sometimes changes word-class
(x)     cumulative expression possible        no cumulative expression
(xi)    not iterable                          possibly iterable


Table 5.5 A list of properties of inflection and derivation 


5.3.1 Relevance to the syntax 


(i) Inflection is relevant to the syntax; derivation is not relevant to the
syntax. 


For the most part, ‘relevant to the syntax’ means that the grammatical 
function or meaning expressed by a morphological pattern is involved in 
syntactic agreement or syntactic government. 

In syntactic government, one word requires another word or phrase 
to have a particular inflectional value. For instance, Polish verbs that are 
negated often require a direct object in the genitive case (5.4). Verbs that 
are not negated require a direct object in the accusative case (5.5). Since 
the presence or absence of negation leads to a difference of case marking 
on the object, case must be relevant to the syntax. It is therefore possible to 
conclude that case is inflectional in Polish according to criterion (i).

(5.4) Tomek           nie  czytał          gazet-y
      Tomek.M.NOM.SG  not  read.3SG.M.PST  newspaper-GEN.SG
      ‘Tomek was not reading a newspaper.’


(5.5) Tomek           czytał          gazet-ę
      Tomek.M.NOM.SG  read.3SG.M.PST  newspaper-ACC.SG
      ‘Tomek was reading a newspaper.’


Agreement is a kind of syntactic relation in which the inflectional value 
of a word or phrase (the target) must be the same as the inflectional value of 
another word or phrase in the sentence (the controller) to which it is closely
related. For instance, in [the boy]NP [walk-s]V and [the girl-s]NP [walk]V, the
target verb walk(s) agrees with the subject NP in number.⁴ And in this girl
and these boys, the target demonstrative this/these agrees with its head noun 
(girl/boys) in number. 
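The definition of agreement can be restated as a simple check: for each agreement feature, the target must carry the same value as the controller. The sketch below is a toy representation, with dictionaries standing in for word-forms and a function name of our own choosing; it is not a model of how any syntactic theory implements agreement.

```python
def agrees(controller, target, features):
    """True if the target has the same value as the controller for every listed feature."""
    return all(controller[f] == target[f] for f in features)


# Toy feature structures for two head nouns and two demonstratives.
girl = {"form": "girl", "number": "SINGULAR"}
boys = {"form": "boys", "number": "PLURAL"}
this_dem = {"form": "this", "number": "SINGULAR"}
these_dem = {"form": "these", "number": "PLURAL"}

ok_singular = agrees(girl, this_dem, ["number"])  # this girl
ok_plural = agrees(boys, these_dem, ["number"])   # these boys
mismatch = agrees(boys, this_dem, ["number"])     # *this boys
```

Agreement for several features at once, as in the examples that follow, corresponds to passing a longer feature list.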

Looking back at Table 5.1, we can notice that the most common inflectional 
features for nouns and pronouns are the same as the most common 
inflectional features for adjectives, demonstratives, relative pronouns and 
adpositions. This is because in agreement relations, the controller is almost 
always a noun, pronoun or noun phrase. Adjectives, demonstratives, etc. 
are typical targets for noun agreement. Verbs are also frequent targets for 
number, person, and sometimes gender agreement. 

A word-form may agree with a controller for multiple features, and/or 
agree with multiple controllers. Examples of agreement are shown in (5.6)- 
(5.8). 


(5.6) Agreement of verb with subject and object in person, number and
      gender (Yimas)
      Krayŋ        narmaŋ        k-n-tay.
      frog.SG(G6)  woman.SG(G2)  3SG.G6.P-3SG.G2.AG-see
      ‘The woman saw the frog.’
      (Foley 1991: 194)


(5.7) Agreement of preposition with complement NP in person and
      number (Classical Nahuatl)
      i-pan   noyac
      3SG-on  my.nose
      ‘on my nose’
      (Sullivan 1988: 108)


4 NP stands for noun phrase.

(5.8) Agreement of demonstrative and adjective in number and gender
      (Swahili)
      wa-le       wa-tu          wa-refu
      PL.G2-that  PL-person(G2)  PL.G2-tall
      ‘those tall people’


Note that agreement features are sometimes overtly marked only on the 
target. For instance, in Italian, determiners and adjectives agree with nouns 
for gender. But while Italian nouns are all lexically associated with one 
of the two genders, they do not have morphological marking for gender. 
Examples like il poeta ‘the poet’, la casa ‘the house’, la mano ‘the hand’, il 
cuoco ‘the cook’, la chiave ‘the key’, il fiume ‘the river’ (il = masculine article, 
la = feminine article) show that -a does not in general mean ‘feminine’, 
and -o does not mean ‘masculine’ (despite this commonly being taught in 
language classrooms). In Italian, only the determiners and adjectives have 
morphological gender marking. 

The criterion of syntactic relevance covers most of the features listed in 
Section 5.1, but there is one problematic area: tense/aspect/mood patterns 
are not obviously relevant to the syntax. Tense, and even more so aspect, 
hardly ever occur in an agreement-like relationship, and are not otherwise 
assigned by the syntax. Do we have to consider tense/aspect/mood to be 
derivational by this criterion, rather than inflectional? A slightly modified 
interpretation of ‘relevant to the syntax’ resolves this problem. Specifically, 
it turns out that certain syntactic rules seem to require reference to tense 
and aspect; this allows us to include tense/aspect/mood under the rubric 
of inflection according to this criterion. 


5.3.2 Obligatoriness 


(ii) Inflectional features are obligatorily expressed on all applicable 
word-forms. Derivational meanings are not obligatorily expressed. 


This can be illustrated by Latin: the lexeme INSULA ‘island’ has ten word- 
forms in its paradigm, and each word-form expresses (and must express) 
one value from each of the features ‘number’ and ‘case’. The Latin speaker 
thus had no choice about whether to use a noun with or without case and 
number features — omitting these features was impossible. Number and 
case are thus inflectional features in Latin according to this criterion. (Note 
that inflection need not be expressed via an overt suffix. For instance, the 
paradigm of the Spanish verb CAMINAR ‘walk’ contains the form camina
‘s/he walks’, with no affix directly corresponding to the third person
singular meaning. But here the absence of an affix is meaningful in itself;
this is not an uninflected form, but an inflected form with zero expression.)

By contrast, expression of a given derivational meaning is not obligatory.
The English suffix -er applies to verbs to derive nouns with the meaning of 
'agent', e.g. DRINKER. But it is not the case that all nouns must express an 
agentive meaning. 


5.3.3 Limitations on application 


(iii) Inflectional values can be applied to their base without arbitrary 
limitations; derivational formations may be limited in an 
arbitrary way. 


Generally speaking, a lexeme's paradigm contains a full set of inflected forms: 
verbal paradigms have word-forms expressing all the tense-aspect-mood 
values that are relevant to the language, noun paradigms have word-forms 
expressing all relevant case-number combinations, adjectival paradigms 
have all relevant comparative forms, etc. This is because a lexeme that does 
not have a full set of forms cannot function in every syntactic context. And 
notably, when exceptions do occur, this can usually be explained easily by 
the incompatibility of the inflectional meaning and the base meaning, i.e. 
the problematic syntactic context never arises in the first place. For instance, 
stative verbs may not have certain aspectual forms (e.g. English *She is 
knowing me), collective nouns may have only singular or only plural forms 
(e.g. English information, *informations), and non-gradable adjectives do 
not have comparative forms (e.g. *Mammoths are deader than Neanderthals). 
Incomplete paradigms whose gaps are not semantically motivated are very 
rare (see Section 8.7 for discussion of these exceptions). 

In comparison, arbitrary derivational gaps are quite common. 
Conceivable derived lexemes may be lacking without any obvious 
semantic explanation. For instance, English has female nouns in -ess such 
as authoress, heiress, priestess, but it is not possible to say *professoress ‘female 
professor’, *presidentess ‘female president’, and so on, although these make 
perfect sense semantically. The Spanish inchoative formation in -ear (see 
Table 5.3) occurs with colour adjectives (verde — verdear ‘become green’, 
negro — negrear ‘become black’, etc.), but it cannot be used freely with other 
adjectives where a ‘become’ sense would be just as appropriate and useful 
(e.g. caro — *carear ‘become expensive’ - this word does not exist). 


5.3.4 Same concept as base 


Some properties are best discussed in terms of canonical inflectional traits 
and canonical derivational traits. 


(iv) Canonical inflected word-forms express the same concept as the 
base; canonical derived lexemes express a new concept. 


While everyone would probably agree that the same concept is expressed in 
go and goes, or in Latin insula (‘island.NOM.SG’) and insulae (‘island.GEN.SG’), 
this is less clear with singular-plural pairs in nouns. For instance, at one 
point in the history of English the plural of brother was brethren. But at a later 
stage, brethren took on the specialized meaning of members of a Christian 
fellowship, and came to be interpreted as a separate lexeme. A new plural 
(brothers) was created to pair with brother in the meaning of male sibling. 
This kind of split into separate lexemes shows that the singular brother and 
plural brethren expressed somewhat different concepts. Number in nouns is 
thus sometimes more similar to derivation according to criterion (iv) than 
to canonical inflection. And on the other side, derivation does not always 
lead to an obviously new concept. Although ‘baker’ is clearly a different 
concept from 'bake', in what sense is 'kindness' a different concept from 
‘kind’? This example also seems to fall into the middle ground between 
inflection and derivation. 


5.3.5 Abstractness 


(v) Inflectional values express a relatively abstract meaning; 
derivational meanings are relatively concrete. 


The abstractness criterion works quite well for inflectional meanings, 
because all of them are highly abstract (in some intuitive sense). And many 
derivational meanings are quite concrete (e.g. French -ier, which denotes a 
kind of tree). But there are also derivational meanings that are just as abstract 
as inflectional meanings (e.g. the meaning ‘status’ of -hood in childhood). So 
-hood is neither canonically derivational, nor canonically inflectional. 


5.3.6 Meaning compositionality 


(vi) Canonical inflected word-forms have compositional meaning; 
canonical derived lexemes have non-compositional meaning. 


While inflectional values usually make a predictable semantic contribution 
to their base, derived lexemes are often semantically idiosyncratic. For 
instance, the Russian derivational suffix -nik means ‘thing associated with 
(base concept)’, and this meaning is clearly present in dnev-nik ‘diary’ 
(dn-ev- ‘day’), noč-nik ‘night lamp; night worker’ (noč ‘night’). However, 
the meaning of dnevnik is not exhausted by that of dnev- and -nik: a diary 
is indeed a kind of thing associated with days (or daily activities), but the 
additional meaning components ‘notebook’ and ‘used for writing’ cannot 
be predicted on the basis of the meaning of the two constituent morphemes 
and must be associated with the lexeme as a whole. Even clearer examples 
are ignorance and reparation; their meanings are probably only historically 
related to ignore and repair. 

But of course, some derivational formations exhibit compositional 
meaning. For instance, the meaning of German female nouns with -in 
(König-in ‘queen’, Professor-in ‘female professor’) is very regular. The suffix 
-in is by most criteria clearly derivational, but in this respect it shows a 
property that is more typical of inflection. 


5.3.7 Position relative to base 


(vii) Canonical inflection is expressed at the periphery of words; 
canonical derivation is expressed close to the root. 


This property can be used as a distinguishing criterion only in special 
circumstances because it is a relative property, not an absolute one. It 
is best illustrated by words that have one derivational affix and one 
inflectional affix on the same side of the root. In such cases, the derivational 
affix almost always occurs between the root and the inflectional affix: 


(5.9) a. English  king-dom-s       root - status (D) - plural (I) 

      b. English  real-ize-d       root - factitive (D) - past tense (I) 

      c. English  luck-i-er        root - proprietive (D) - comparative (I) 

      d. Turkish  iç-ir-iyor       root - causative (D) - imperfective aspect (I) 
                  [drink-CAUS-IMPF.3SG] 
                  ‘makes (somebody) drink’ 

      e. Arabic   na-ta-labbasa    1st plural subject (I) - reflexive (D) - root 
                  [1PL-REFL-clothe.PRF] 
                  ‘we clothed ourselves’ 


When there are more than two affixes, normally all the derivational 
affixes occur closer to the root than the inflectional affixes (e.g. German 
nation-al-isier-te-n ‘(they) nationalized’: root — relational adjective (D) - 
factitive verb (D) — past tense (I) - third person plural subject agreement (I)). 
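
The ordering generalization can be stated as a simple check on segmented words. The following is a minimal sketch under stated assumptions: the segmentations and the D/I labels follow the examples in this section (the label D for German ver- is an assumption made only for illustration), and the function name is_canonical_order is hypothetical.

```python
# Each word is a list of (morph, kind) pairs in linear order.
# kind is "root", "D" (derivational) or "I" (inflectional).
WORDS = {
    "king-dom-s": [("king", "root"), ("dom", "D"), ("s", "I")],
    "real-ize-d": [("real", "root"), ("ize", "D"), ("d", "I")],
    "na-ta-labbasa": [("na", "I"), ("ta", "D"), ("labbasa", "root")],
    "nation-al-isier-te-n": [
        ("nation", "root"), ("al", "D"), ("isier", "D"), ("te", "I"), ("n", "I"),
    ],
    # Non-canonical German case discussed in this section:
    # comparative -er (I) sits inside the verb-forming -n (D).
    "ver-schön-er-n": [("ver", "D"), ("schön", "root"), ("er", "I"), ("n", "D")],
}


def is_canonical_order(morphs):
    """True if, on each side of the root, no derivational affix
    lies further from the root than an inflectional affix."""
    root = next(i for i, (_, kind) in enumerate(morphs) if kind == "root")
    prefixes_outward = morphs[:root][::-1]   # nearest prefix first
    suffixes_outward = morphs[root + 1:]     # nearest suffix first
    for side in (prefixes_outward, suffixes_outward):
        seen_inflection = False
        for _, kind in side:
            if kind == "I":
                seen_inflection = True
            elif seen_inflection:  # a "D" outside an "I"
                return False
    return True


for word, morphs in WORDS.items():
    print(word, is_canonical_order(morphs))
```

Because the check is relative (it compares affixes on the same side of the root), it says nothing about words with only one affix, which is why the criterion applies only in special circumstances.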

Yet here again, it is possible to find examples of inflection and derivation 
that do not exhibit the canonical traits. For example, German has deadjectival 
factitive verbs that are based on the inflectional comparative form (e.g. 
schön ‘beautiful’ → schön-er ‘more beautiful’ → ver-schön-er-n ‘make more 
beautiful’); the inflectional comparative affix -er is closer to the root than the 
derivational affix -n. And English allows plurals inside many compounds 
(e.g. publications list, New York Jets fan). 


5.3.8 Base allomorphy 


(viii) Inflection induces less base allomorphy; derivation induces more 
base allomorphy. 


Base allomorphy is also a relative property. It can be demonstrated with 
roots that show allomorphy in derived lexemes, but not in inflected word- 
forms: 


(5.10)             ROOT             INFLECTED FORM    DERIVED LEXEME 

    a. English     destroy          destroy-ed        destruc-tion 

    b. English     broad            broad-er          bread-th 

    c. German      Erde             Erde-n            ird-isch 
                   ‘earth’          ‘earths (PL)’     ‘earthly’ 

    d. Latin       honor            honor-is          hones-tus 
                   ‘honour’         ‘honour-GEN’      ‘honest’ 

    e. Italian     dialogo [-g-]    dialogh-i [-g-]   dialogico [-dʒ-] 
                   ‘dialogue’       ‘dialogue-s’      ‘dialogical’ 

    f. Arabic      kataba           katab-tu          kitaab 
                   ‘he wrote’       ‘I wrote’         ‘book’ 


But the opposite pattern can also be found, as in the following examples 
from Serbian: 


(5.11)  ROOT                  INFLECTED FORM     DERIVED LEXEME 
        junak                 junac-i            junak-inja 
        ‘hero (M)’            ‘heroes’           ‘heroine’ 
        pesnik                pesnic-i           pesnik-inja 
        ‘poet (M)’            ‘poets’            ‘poet (F)’ 
        psiholog              psiholoz-i         psiholog-inja 
        ‘psychologist (M)’    ‘psychologists’    ‘psychologist (F)’ 
        monah                 monas-i            monah-inja 
        ‘monk’                ‘monks’            ‘nun’ 


Base allomorphy is thus yet another tendency, according to which a 
morphological pattern may be more typically inflectional, or more typically 
derivational. 


5.3.9 Word-class change 


(ix) Canonical inflection does not change the word-class of the base; 
derivational affixes may change the word-class of the base. 


It is often claimed that derivational formations may change the word-class 
of the base, but inflectional ones never do. While the first part of this claim 
is true (e.g. many of the examples in Section 5.2 consisted of word-class- 
changing derivation), the second part is questionable. There seem to be two 
types of word-class-changing morphological patterns — one that is typical 
of derivational patterns, and one that is associated with patterns that are, to 
some degree, inflectional. 

In most instances, when word-class-changing affixes are added to a base, 
the grammatical properties of the base are no longer relevant for purposes 
of agreement. In (5.12), the Russian adjective otkrytoe ‘open’ agrees for 
gender (and number and case) with the noun okno ‘window’, but when 
the denominal adjective okonnaja ‘window (adj)’ is derived from okno, the 
nominal stem can no longer serve as the controller for agreement. This is a 
typical consequence of (derivational) word-class-changing operations, and 
the reason (5.12b) is ungrammatical. 


(5.12) Russian 


a. otkryt-oe       okno 
   open-N.SG.NOM   window.N.SG.NOM 
   ‘open window’ 


b. *otkryt-oe      okon-naja         rama 
   open-N.SG.NOM   window-F.SG.NOM   frame.F.SG.NOM 
   ‘open window frame’ (i.e. ‘frame of an open window’) 


However, some languages have affixes that can occur in structures 
parallel to (5.12b). Consider the example in (5.13), from another Slavic 
language: Upper Sorbian. Here, mojeho ‘my’ agrees for gender with the 
masculine noun muž ‘husband’, despite this being the root of the denominal 
adjective mužowa. 


(5.13) moj-eho       muž-ow-a                sotra 
       my-M.SG.GEN   husband-POSS-F.SG.NOM   sister.F.SG.NOM 
       ‘my husband's sister’ 
       (Corbett 1987: 303) 


Thus, -ow seems to have the property that it is word-class changing, but in 
a way that allows the properties of its base to still control agreement by a 
modifying adjective. 

Moreover, and crucially, -ow meets some of the criteria for inflection, 
e.g. it has fully compositional meaning. It is also highly productive (a trait 
not discussed in detail here, but one that is typical of inflection). Perhaps, 
then, no clear-cut binary distinction can be made between (derivational) 
affixes that change word-class, and (inflectional) affixes that do not (Corbett 
1987; Haspelmath 1996). It is better to say that canonical inflection does not 
change word-class. Word-class-changing inflection is discussed further in 
Section 11.4. 


5.3.10 Cumulative expression 


(x) Inflectional values may be expressed cumulatively; derivational 
meanings are not expressed cumulatively. 


This criterion applies only to a small subset of cases, but is nevertheless 
interesting. We saw above that several inflectional values may be expressed 
by a single affix, as in Latin insularum ‘of the islands’, where the suffix -arum 
expresses both ‘genitive’ and ‘plural’. Such cases of cumulation seem to be 
very rare in derivational formations, but a possible example is Dutch -ster 
‘agent’ and ‘female’. 


5.3.11 Iteration 


(xi) Inflectional values cannot be iterated; derivational meanings can 
sometimes be iterated. 


Inflection is more restricted in that inflectional affixes cannot be iterated. 
Thus, although it would make sense logically to have an iterated plural 
(e.g. *cat-s-es ‘sets of cats’), such double plurals are virtually unattested. Or 
one could imagine a past-tense affix to be repeated to give a sense of remote 
past (e.g. *didded ‘had done’). With derivational formations, iteration is 
not common either, but it is possible, for instance, with diminutives in 
Afrikaans (kind-jie-tjie ‘a little little child’), and with various prefixes in 
English (post-post-modern) and German (Ur-ur-ur-großvater ‘great-great- 
great-grandfather’). Another instance is the double causative, as we find 
it in Huallaga Quechua: wañu- ‘die’, wañu-chi- ‘kill’, wañu-chi-chi- ‘cause to 
kill’ (Weber 1989: 164). This property also applies to only a small number 
of morphological patterns, those for which iteration would be semantically 
plausible. 


5.4 Dichotomy or continuum? 


Do these facts indicate that the difference between inflection and derivation 
is a dichotomy, or a continuum? It turns out that there is less disagreement 
about the facts themselves, and more disagreement about the importance of 
some facts. Proponents of the dichotomy approach tend to consider the first 
three properties in Section 5.3 (relevance to the syntax, obligatoriness and 
limitations on applicability) to be the most important, especially relevance 
to the syntax. And inasmuch as these three criteria are logically independent 
of each other, but nonetheless tend to categorize a given morphological 
meaning in the same way, either as derivation or as inflection, proponents 
of the dichotomy approach have argued that these traits are indicative of a 
distinction between inflection and derivation in the formal architecture of 
the morphological system. 

By contrast, the reason some morphologists prefer the continuum 
approach is that they want to avoid making an arbitrary choice from the 
criteria in Table 5.5. Proponents of the continuum approach thus tend to 
consider the properties as a collective whole. If all these criteria are taken 
seriously, then the continuum approach is almost inevitable, because 
different criteria may point to different conclusions. But what is particularly 
interesting is that the mismatches between the criteria are not random, 
but present a surprisingly orderly picture. As an example, let us look at 
Table 5.6. It gives a sample list of six morphological formations, which are 
evaluated by five of the eleven criteria. 


Language   Formation      Example             cum   obl   new   unl   cm 
English    3rd singular   walk/walks          I     I     I     I     I 
English    noun plural    song/songs          D     I     I     I     I 
Spanish    diminutive     gato/gatito         D     D     I     I     I 
English    repetitive     write/rewrite       D     D     D     I     I 
English    female noun    poet/poetess        D     D     D     D     I 
English    action noun    resent/resentment   D     D     D     D     D 


Note: cum = cumulative expression; obl = obligatory; new = new concept; 
unl = unlimited applicability; cm = compositional meaning. 


Table 5.6 A continuum from inflection to derivation 


Table 5.6 is a simplification in various respects (e.g. in that it ignores the 
difficulties in applying some of the criteria), but it suffices to illustrate the 
continuum approach. The English third person singular suffix -s cumulatively 
expresses person/number and present tense; the other formations show 
no cumulation. Both verbal agreement and nominal number are arguably 
present in any verb and noun form, so these two are obligatory, whereas 
this is not the case for the other formations. Diminutives are like classical 
inflected forms in that they do not (necessarily) denote a new concept - 
Spanish gatito often refers to the same kind of cat as gato, but occurs only 
under special pragmatic circumstances. Only the English female suffix -ess 
and the action-noun suffix -ment are limited in applicability, and only -ment 
is semantically irregular (as we can see in govern/government, which shows 
a different semantic relation from resent/resentment). On such a continuum 
view, agreement morphology such as -s (in walks) is canonical inflection and 
English action nouns in -ment are canonical derivation, but they are merely 
extremes of a continuum on which many intermediate items are found as 
well. 
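
The orderly pattern in Table 5.6 can be made explicit with a small sketch. It encodes each row of the table as a string of I/D values in the column order cum, obl, new, unl, cm. Counting the I values with equal weight is an assumption that corresponds to the continuum view (it is not a claim about how the criteria should be weighted), and the names inflection_score and is_orderly are hypothetical.

```python
CRITERIA = ("cum", "obl", "new", "unl", "cm")

# Rows of Table 5.6: "I" = inflection-like value, "D" = derivation-like value.
TABLE_5_6 = {
    "English 3rd singular (walk/walks)": "IIIII",
    "English noun plural (song/songs)": "DIIII",
    "Spanish diminutive (gato/gatito)": "DDIII",
    "English repetitive (write/rewrite)": "DDDII",
    "English female noun (poet/poetess)": "DDDDI",
    "English action noun (resent/resentment)": "DDDDD",
}


def inflection_score(profile):
    """Number of criteria on which the formation behaves like inflection."""
    return profile.count("I")


def is_orderly(profile):
    """True if the mismatches are not random: once a criterion gives 'I',
    every criterion to its right gives 'I' as well."""
    return "ID" not in profile


# Rank formations from most inflection-like to most derivation-like.
ranked = sorted(TABLE_5_6, key=lambda f: inflection_score(TABLE_5_6[f]), reverse=True)
for formation in ranked:
    profile = TABLE_5_6[formation]
    print(f"{inflection_score(profile)}/{len(CRITERIA)}  {formation}")
```

A proponent of the dichotomy approach would instead look only at selected columns (for example obl and unl) and draw a sharp line wherever those values change.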

In the end, linguists must decide the relative merit of these eleven criteria 
as diagnostics of inflectional/derivational morphology. If we consider 
(i)-(ii) to be the most important diagnostics, this naturally leads to the 
conclusion that inflection and derivation are dichotomous categories, and 
as a result, that inflectional and derivational rules (potentially) operate in 
the architecture of the language system according to distinct principles. 
However, if all traits are considered equally important indicators, the best 
conclusion is that canonical derivation and inflection are end-points along a 
continuum of morphological properties, and no sharp division can be made 
between them. 


5.4.1 Inherent and contextual inflection 


Before considering the formal architecture of inflection and derivation, we 
should consider one other approach. Specifically, one popular modification 
of the dichotomy approach divides inflectional morphology into two 
subtypes: inherent inflection and contextual inflection. A tri-partition takes 
better account of the full range of characteristics described above, while 
maintaining sharp boundaries between types. 

Inherent inflection comprises features that are relevant to the syntax 
but convey a certain amount of independent information. These include 
a verb's tense and aspect values, and the number values for nouns. For 
example, the number value of a noun is mostly dictated by the real-world 
entity that the noun refers to (the referent) — if the referent is one, the noun 
is singular. If it is more than one, the noun is usually plural (assuming the 
language has only two inflectional values for number). In this sense, nominal 
number contributes independent information to syntactic structure. Some 
grammatical cases can also be inherent, for instance, locative (e.g. Turkish 
ev-de [house-LOC] ‘in the house’), ablative (e.g. Huallaga Quechua mayu-pita 
[river-ABL] ‘from the river’) and instrumental (e.g. Russian nož-om 
[knife-INS] ‘with a knife’), which similarly make their own semantic contribution 
and are mostly not syntactically determined. 

By contrast, contextual inflection comprises values which are assigned to 
a word because of the syntactic context in which it appears. Included here 
are structural cases — i.e. cases like nominative, accusative and genitive, 
which are typically required by syntactic agreement or government but 
express largely redundant information. Note that inflectional features 
may be inherent for one word-class and contextual for another. Number 
is inherent for nouns because it expresses independent information, but 
contextual for adjectives, verbs, etc., which agree with the noun for this 
feature. 

A distinction between inherent and contextual inflection is appealing in 
part because inherent inflection often shares properties with derivation, 
whereas contextual inflection does not to the same degree. Here we 
highlight three examples. 

First, although both contextual and inherent inflection often have 
compositional meaning, on the relatively rare occasion that inflection has 
an unpredictable, idiosyncratic meaning, the relevant examples come from 
inherent inflection. Some examples from Dutch are given in (5.14). 


(5.14) value                inflected word   compositional meaning   idiosyncratic meaning 
       comparative          ouder            ‘older’                 ‘parent’ 
       plural               vaders           ‘fathers’               ‘forefathers’ 
       past participle      bezeten          ‘possessed’             ‘mad’ 
       present participle   ontzettend       ‘appalling’             ‘very’ 
       infinitive           eten             ‘(to) eat’              ‘food’ 

       (Booij 1993) 


Plural forms of verbs and other examples of contextual inflection do not 
seem to exhibit this kind of idiosyncrasy. 

Second, just as derivational patterns tend to be closer to the root than 
inflectional patterns, inherent inflectional patterns tend to be closer to the 
root than contextual ones. Moreover, those exceptional cases in which an 
inflectional affix is closer to the root than a derivational affix, and those in 
which an inflectional affix occurs on a first compound member, generally 
involve inherent inflection (see the discussion following (5.9) for an 
example). 

Finally, inherent inflection is more likely to induce base allomorphy than 
is contextual inflection. A few examples are given in (5.15). 


(5.15)           contextual inflection         inherent inflection 
       English   sing/sings                    sing/sang 
                 (person/number agreement)     (past tense) 
       German    warm-er/warm-e                warm/wärmer 
                 (gender agreement)            (comparative) 
                 ‘warm-MASC/warm-FEM’          ‘warm/warmer’ 
       Arabic    kitaab-un/kitaab-in           kitaab/kutub 
                 (structural case)             (plural) 
                 ‘book-NOM/book-GEN’           ‘book.SG/book.PL’ 


Thus, the conceptual distinction between contextual and inherent 
inflection is useful because there are a number of points on which the two 
kinds of inflection behave rather differently. Proponents of the continuum 
approach take this as continued evidence that canonical inflection and 
canonical derivation are end-points on a continuum of traits, but proponents 
of the dichotomy approach sometimes work with a tri-partition, rather than 
a strict dichotomy. 


5.5 Inflection, derivation and the syntax-morphology 
interface 


Perhaps the most interesting thing about the debate between the dichotomy 
and continuum approaches is that these perspectives have significant 
consequences for our formal description of morphological architecture. 
In particular, morphologists have debated whether inflectional and 
derivational rules are collected together into a single morphological 
component of the grammar, or split between two different components. The 
continuum approach allows for only the former option. The dichotomy 
approach is consistent with either, but perhaps more naturally aligned with 
the proposal that inflection and derivation occupy separate areas of the 
grammar. The dichotomy and continuum perspectives thus suggest a very 
different formal relationship between inflection, derivation, and syntax. 


5.5.1 The dichotomy approach and split morphology 


As we have seen, morphologists who adopt the dichotomy approach think 
of inflection and derivation as having fundamentally different properties, 
and they usually take the first property of Table 5.5, relevance to the syntax, 
as the crucial criterion for distinguishing inflection from derivation. Given 
this perspective, it is logical to conclude that inflectional and derivational 
patterns are governed, at least in part, by distinct grammatical principles. 
In essence, there is not one coherent morphological system, but rather two 
systems or subsystems — one for derivation and compounding, and one for 
inflection. 

What is the relationship of these two systems to each other, and to syntax? 
One proposal argues that rules of derivation and compounding (i.e. all of 
word-formation) operate in a component of the grammar that feeds into the 
syntax, and that inflectional rules apply only after the syntactic rules have 
applied. In other words, word-formation is pre-syntactic, inflection is post- 
syntactic. This is referred to as the split-morphology hypothesis, and the 
architecture of the grammar is shown schematically in Figure 5.4. 


Figure 5.4 Word-formation and inflection in a split-morphology architecture 


Let us look at a concrete example of how this works. Our example sentence 
is (5.16) from Latin. 


(5.16) Imperator         saluta-v-it     popul-um. 
       emperor(NOM.SG)   greet-PRF-3SG   people-ACC.SG 
       ‘The emperor greeted the people.’ 


The Latin lexicon contains simple lexemes such as IMPERARE ‘command’, 
SALUS ‘health’ and POPULUS ‘people’. The word-formation rules create 
complex derived lexemes such as IMPERATOR 'commander, emperor' and 
SALUTARE ‘greet’. Word-formation is said to operate ‘in the lexicon’ (i.e. in 
this approach, the lexicon contains both a list and rules), so both simple 
lexemes and derived lexemes are the output of the lexicon. The syntax 
contains phrase structure rules (e.g. S → NP VP, VP → V NP; 'VP' is an 
abbreviation of verb phrase), case-assignment rules, which among other 
things ensure that the direct object gets accusative case ([VP V NP_ACC]), 
and agreement rules, which ensure that the inflectional values on the target 
match those on the controller. The syntactic rules might thus generate an 
abstract representation, as in (5.17) (here, the subject NP is the controller 
for person and number agreement, and the verb is the target). The feature- 
value notations at the end of the tree are often called morphosyntactic 
representations. All of this is of course greatly simplified, but is sufficient 
for our present purposes. 


(5.17)                       S 
                NP                       VP 
                                  V               NP 
           CASE: NOM        PERSON: 3RD      CASE: ACC 
           NUMBER: SG       NUMBER: SG       NUMBER: SG 
           PERSON: 3RD      TENSE: PRF       PERSON: 3RD 


The lexemes from the lexicon are inserted into these abstract syntactic 
representations, yielding a lexically specified syntactic representation as 
shown in (5.18) (the representation with labelled brackets and subscripts is 
equivalent to the tree representation in (5.17) and saves space). 


(5.18) [S [NP IMPERATOR_NOM/SG/3RD] [VP [V SALUTARE_SG/3RD/PRF] 
       [NP POPULUS_ACC/SG/3RD]]] 


Now the rules of inflection operate and create the correct word-forms 
from the lexemes with their feature specification: IMPERATOR_NOM/SG/3RD 
becomes imperator, SALUTARE_SG/3RD/PRF becomes salutavit, and 
POPULUS_ACC/SG/3RD becomes populum. This gives us the correct output 
(once phonological rules of pronunciation have applied): Imperator salutavit 
populum. 
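
The order of operations in the split-morphology architecture can be sketched as a small pipeline. This is a deliberately simplified illustration, not a formalization proposed in the literature: the inflection step is a lookup table standing in for real inflectional rules, and all names (lexicon_output, MORPHOSYNTAX, INFLECTION, derive) are hypothetical. The lexemes, feature values and word-forms are those of (5.16)-(5.18).

```python
# Step 1 (pre-syntactic): the lexicon is a list plus word-formation rules.
SIMPLE_LEXEMES = {"IMPERARE", "SALUS", "POPULUS"}
WORD_FORMATION = {"IMPERARE": "IMPERATOR", "SALUS": "SALUTARE"}  # base -> derived


def lexicon_output():
    """Simple and derived lexemes are both outputs of the lexicon."""
    return SIMPLE_LEXEMES | set(WORD_FORMATION.values())


# Step 2: the syntax provides morphosyntactic representations, as in (5.17).
MORPHOSYNTAX = [
    ("NP", {"CASE": "NOM", "NUMBER": "SG", "PERSON": "3RD"}),
    ("V", {"NUMBER": "SG", "PERSON": "3RD", "TENSE": "PRF"}),
    ("NP", {"CASE": "ACC", "NUMBER": "SG", "PERSON": "3RD"}),
]


def feature_key(features):
    """Order-independent, hashable version of a feature-value set."""
    return frozenset(features.items())


# Step 4 (post-syntactic): inflection maps lexeme + features to a word-form.
INFLECTION = {
    ("IMPERATOR", feature_key(MORPHOSYNTAX[0][1])): "imperator",
    ("SALUTARE", feature_key(MORPHOSYNTAX[1][1])): "salutavit",
    ("POPULUS", feature_key(MORPHOSYNTAX[2][1])): "populum",
}


def derive(lexemes):
    """Step 3 inserts lexemes into the syntactic slots, as in (5.18);
    only then does inflection apply."""
    available = lexicon_output()
    word_forms = []
    for lexeme, (_, features) in zip(lexemes, MORPHOSYNTAX):
        if lexeme not in available:
            raise KeyError(f"{lexeme} is not an output of the lexicon")
        word_forms.append(INFLECTION[(lexeme, feature_key(features))])
    return " ".join(word_forms)


print(derive(["IMPERATOR", "SALUTARE", "POPULUS"]))  # imperator salutavit populum
```

The point of the sketch is the ordering: word-formation has finished before the syntax sees any lexeme, and inflection has no access to anything except the lexeme and its feature values.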

In addition to its intuitive plausibility, this architecture of the formal 
grammar is often claimed to have one significant advantage: it explains 
the fact that derivation is generally 'inside' inflection, i.e. it occurs closer 
to the root. If affixes are always attached peripherally by morphological 
rules, then the affix order of king-dom-s automatically follows from the 
order of application of the rules in Figure 5.4. The lexicon creates KINGDOM 
from the simple lexeme KING, and the inflection -s is added after the 
syntactic component. There is no way a form like *king-s-dom could ever 
arise, because inflected forms like king-s cannot be the input to word- 
formation rules. Inflected forms should also not occur inside compounds, 
because compounding is a lexeme-forming rule in the lexicon. Thus, the 
impossibility of *trees plantation in English follows from this as well (the 
correct form is tree plantation, where the first part is uninflected, despite the 
plural meaning). This seems to lend further support to the split-morphology 
claim. 

At the same time, however, split morphology encounters some empirical 
problems. First and most obviously, exceptions in which inflection occurs 
inside derivation are occasionally observed; we have already encountered 
the German example ver-schön-er-n ‘make more beautiful’, in which the 
(inflectional) suffix -er is closer to the root than the (derivational) suffix 
-n. Such exceptions cannot obviously be accommodated under the split- 
morphology proposal. Even more importantly, as we saw in Section 5.4.1, 
inherent and contextual inflection sometimes behave quite differently, 
with inherent inflection having characteristics typical of derivation. This 
presents an intrinsic problem for split morphology, which groups inherent 
and contextual inflection together. 

Attentive readers might suggest that perhaps the solution is to maintain the 
split-morphology hypothesis, but divide the two grammatical components 
such that derivation, compounding and inherent inflection apply pre- 
syntactically, and only contextual inflection applies post-syntactically. 
Based on the evidence presented so far, this would seem to be reasonable. 
However, further evidence shows that this is not a satisfactory solution 
either. The most important data here is the fact that the same morphological 
pattern may express both inherent and contextual inflection. For instance, 
in Latin nouns, a single morpheme marks both number (inherent) and case 
(often contextual). This cumulative expression indicates an interaction of 
inherent and contextual inflection that is extremely difficult to account for 
if they are located in separate grammatical components. 

Note that the dichotomy approach does not require us to accept the 
split-morphology hypothesis. Another possibility is that derivation and 
inflection operate according to partially different principles, but without 
them being split into pre- and post-syntactic components of the grammar. 
Still, split morphology is the most extreme logical interpretation of the 
dichotomy perspective. 


5.5.2 The continuum approach and single-component 
architecture 


According to the continuum approach, no firm distinction between inflection 
and derivation can be made, so inflectional and derivational rules must 
operate in the same grammatical component according to fundamentally 
similar grammatical principles. This, of course, has consequences for the 
relationship between morphology and syntax: the continuum approach 
is not consistent with the split-morphology hypothesis. Instead, it is 
consistent with the idea that syntax generates abstract structures containing 
morphosyntactic representations but no lexical information, as in (5.17), 
but instead of inserting a (derived) lexeme at each syntactic node and then 
generating an inflected form, a word-form is inserted that is consistent with 
the syntactic information. This does not require derivation and inflection 
to be split between pre- and post-syntactic operations. This grammar 
architecture is represented in Figure 5.5. 

In this way, the job of the morphological component is to provide word- 
forms whose inflectional values match (or at least, do not contradict) those 
of the morphosyntactic representation. 
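
The idea of inserting a word-form that ‘does not contradict’ the morphosyntactic representation can be sketched as a compatibility check. The sketch below is an assumption-laden illustration: the small inventory of forms of POPULUS (populus, populum, populi) and the names WORD_FORMS, compatible and insert_word_form are introduced here only for the example.

```python
# Hypothetical inventory: word-forms of POPULUS with their inflectional values.
WORD_FORMS = [
    ("populus", {"CASE": "NOM", "NUMBER": "SG"}),
    ("populum", {"CASE": "ACC", "NUMBER": "SG"}),
    ("populi", {"CASE": "GEN", "NUMBER": "SG"}),
]


def compatible(form_values, representation):
    """A word-form is compatible if none of its values contradicts the
    representation; features the representation leaves open are ignored."""
    return all(representation.get(feature, value) == value
               for feature, value in form_values.items())


def insert_word_form(representation):
    """Return every word-form that can fill a node with this representation."""
    return [form for form, values in WORD_FORMS
            if compatible(values, representation)]


object_np = {"CASE": "ACC", "NUMBER": "SG", "PERSON": "3RD"}
print(insert_word_form(object_np))  # ['populum']
```

Nothing in this check distinguishes derived from underived lexemes or orders derivation before inflection, which is the sense in which the architecture has a single morphological component.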


[Figure 5.5 diagram; box labels: ‘morphosyntactic representation’, ‘word-formation’, ‘inflection’, ‘phonology, pronunciation’] 


Figure 5.5 Word-formation and inflection in a single-component architecture 


The dichotomy approach is also consistent with the idea that all of 
morphology is collected into one component of the grammatical architecture, 
but does not align with a single-component hypothesis as naturally as the 
continuum approach does. 

The single-component hypothesis cannot be used to explain why 
derivational affixes occur closer to the root and inflectional affixes occur 
more peripherally, but perhaps this should not worry us. The ordering 
of inflectional affixes with respect to derivational affixes is not the only 
generalization that can be made. Even within inflectional affixes and 
within derivational affixes, some orders are strongly preferred, and others 
are strongly dispreferred. For instance, the diminutive suffix in Spanish 
is always outside other derivational suffixes (e.g. the female noun suffix 
-es(a): baron-es-ita ‘little baroness’, not *baronitesa). And case suffixes almost 
always follow number suffixes, rather than vice versa (e.g. Turkish ev-ler-in 
[house-PL-GEN] ‘of the houses’, not *ev-in-ler). 

There may be a system-external explanation for these affix ordering 
patterns. It has recently been proposed that the ability of two affixes to 
combine, and the order in which they do so, depend in part on constraints 
created by how words are mentally processed (Hay and Plag 2004). Affixes 
that are more likely to be accessed in the lexicon via the decomposition 
route (as opposed to the whole-word route) occur outside of affixes that are 
less likely to be decomposed (see Chapter 4 for discussion of lexical access). 
Thus, restrictions on the order of affixes may not be something that needs to 
be captured directly by our description of morphological architecture at all. 
The possibility of a system-external explanation suggests that the inability 
of a single-component architecture to explain affix order is not necessarily a 
disadvantage of the model. 


Summary of Chapter 5 


Morphologists use different terminology for talking about inflection 
and derivation. Inflection is described in terms of values grouped into 
features; derivation is described in terms of individual morphological 
patterns and their meanings. The range of inflectional meanings 
found in languages is severely restricted; most of them fall under the 
general headings of number, gender, case, person, tense, aspect and 
mood. Derivational meanings are more varied, but many recurrent 
types can be identified as well. 

A persistent question in morphological theory is whether differences
between inflection and derivation fall along a continuum, or sharply
divide inflection and derivation into distinct types. Linguists
adopting the dichotomy approach have usually emphasized criteria 
such as relevance to the syntax and obligatoriness, whereas linguists 
favouring the continuum approach have considered a whole range 
of criteria, including compositional meaning, cumulative expression 
and closeness to the root. Within inflection, a distinction between 
(more derivation-like) inherent inflection and contextual inflection 
can be made. 

Formally, the dichotomy approach is consistent with a theory that 
splits derivational and inflectional rules into separate components that 
apply pre- and post-syntactically, respectively (the split-morphology 
hypothesis). By contrast, some formalizations of the dichotomy 
approach and all instantiations of the continuum approach treat 
inflection and derivation as comprising a single morphological area
(the single-component hypothesis). While there is little consensus 
about how distinct inflection and derivation are, a number of empirical 
issues argue against split morphology. 


Appendix. Notation conventions for inflectional values 


There are three widespread conventions used to represent inflectional 
values. The format of Figure 5.6, sometimes called a grid, is the standard 
way to represent the set of word-forms (i.e. the paradigm) of a lexeme. 


English (feature: tense)

PRESENT      PAST
walk(-s)     walked

Spanish (features: person, number)

        SINGULAR    PLURAL
1ST     camin-o     camina-mos
2ND     camina-s    caminá-is
3RD     camina-Ø    camina-n


Figure 5.6 Inflectional features and values 


In a grid, values of the same feature are shown in columns or rows that 
are usually labelled with the name of the feature. Each combination of 
inflectional values defines a cell. In Figure 5.6, values are printed in small 
capitals and features are enclosed in ovals. The first paradigm is from 
English, where verbs primarily inflect for tense. The second paradigm is
from Spanish, where verbs inflect for two values of the feature ‘number’
(singular and plural) and three values of the feature ‘person’ (first, second 
and third) (they also inflect for tense — see below). 

When a lexeme inflects for three features simultaneously, a two- 
dimensional representation is no longer sufficient, and we would need 
a three-dimensional grid. Figure 5.7 is an attempt at drawing one. For 
practical purposes, three-dimensional (and especially n-dimensional, for n
> 3) paradigms are mostly shown in two spatial dimensions as well. Thus,
Figure 5.7 is generally replaced by Figure 5.8. 
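
The flattening of a three-dimensional paradigm into two-dimensional grids can be sketched in a few lines of Python. The forms are those of Spanish caminar as given in the figures of this appendix; the dictionary layout and the function names (flatten, render) are illustrative assumptions, not part of any standard notation.

```python
# Three-dimensional paradigm keyed by (tense, person, number).
# The data layout is an assumption made for this sketch.
PARADIGM = {
    ("PRESENT", "1ST", "SINGULAR"): "camin-o",
    ("PRESENT", "1ST", "PLURAL"): "camina-mos",
    ("PRESENT", "2ND", "SINGULAR"): "camina-s",
    ("PRESENT", "2ND", "PLURAL"): "caminá-is",
    ("PRESENT", "3RD", "SINGULAR"): "camina-Ø",
    ("PRESENT", "3RD", "PLURAL"): "camina-n",
    ("PAST", "1ST", "SINGULAR"): "camina-ba-Ø",
    ("PAST", "1ST", "PLURAL"): "caminá-ba-mos",
    ("PAST", "2ND", "SINGULAR"): "camina-ba-s",
    ("PAST", "2ND", "PLURAL"): "camina-ba-is",
    ("PAST", "3RD", "SINGULAR"): "camina-ba-Ø",
    ("PAST", "3RD", "PLURAL"): "camina-ba-n",
}


def flatten(paradigm):
    """Return one two-dimensional grid (person x number) per tense value."""
    grids = {}
    for (tense, person, number), form in paradigm.items():
        grids.setdefault(tense, {}).setdefault(person, {})[number] = form
    return grids


def render(grids):
    """Lay the grids out one below the other, as on a printed page."""
    lines = []
    for tense, grid in grids.items():
        lines.append(f"{tense} TENSE")
        lines.append(f"{'':6}{'SINGULAR':14}{'PLURAL'}")
        for person, row in grid.items():
            lines.append(f"{person:6}{row['SINGULAR']:14}{row['PLURAL']}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(flatten(PARADIGM)))
```

Choosing tense as the dimension that is "unrolled" is arbitrary here; any of the three features could serve, and the choice mainly affects which contrasts are easiest to see on the page.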


[Three-dimensional grid for Spanish; axes: number (SINGULAR, PLURAL), person (1ST, 2ND, 3RD), tense (PRESENT, PAST)]


Figure 5.7 A three-dimensional representation of a three-dimensional paradigm 


PRESENT TENSE

        SINGULAR       PLURAL
1ST     camin-o        camina-mos
2ND     camina-s       caminá-is
3RD     camina-Ø       camina-n


PAST TENSE

        SINGULAR       PLURAL
1ST     camina-ba-Ø    caminá-ba-mos
2ND     camina-ba-s    camina-ba-is
3RD     camina-ba-Ø    camina-ba-n


Figure 5.8 A two-dimensional representation of a three-dimensional paradigm

The inflectional information contained in a word-form can also be
represented in a feature-value notation, as in the examples in (5.19). 
Feature-value notation indicates the inflectional feature on the left side of 
the colon, and the inflectional value on the right, with all ‘feature: value’ 
pairs enclosed in square brackets. This notation is most commonly used 
when writing lexical entries or word-schemas. Thus, (5.20) is a more 
informative and formal way to represent (5.21). 


(5.19) a. Spanish
          caminábamos           [ TENSE: PAST
          ‘we were walking’       PERSON: 1ST
                                  NUMBER: PLURAL ]

       b. Sanskrit
          dātṛṇoḥ               [ NUMBER: DUAL
          ‘of two givers’         GENDER: NEUTER
                                  CASE: GENITIVE ]

(5.20)  /X/N                            /Xz/N
        ‘x’                      ↔      ‘x’
        [ NUMBER: SINGULAR ]            [ NUMBER: PLURAL ]

(5.21)  /X/N           /Xz/N
        ‘x’     ↔      ‘plurality of xs’

Finally, for practical purposes, the inflectional values may also be written
as subscripts of word-forms, e.g. caminábamos_1PL.PST, dātṛṇoḥ_DU.N.GEN.
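
As a rough illustration, feature-value notation maps naturally onto a dictionary of features and values, from which both the bracketed notation and a compact subscript label can be derived. The function names and the abbreviation table below are assumptions made for this sketch; the fused person-number label (1PL) follows the subscript example above.

```python
# Abbreviations for the compact label; this table is an illustrative assumption.
ABBREV = {
    "1ST": "1", "2ND": "2", "3RD": "3",
    "SINGULAR": "SG", "PLURAL": "PL",
    "PRESENT": "PRS", "PAST": "PST",
}


def feature_value(form, features):
    """Render a word-form with its 'FEATURE: VALUE' pairs in square brackets."""
    pairs = ", ".join(f"{feat}: {val}" for feat, val in features.items())
    return f"{form} [{pairs}]"


def compact_label(features):
    """Fuse person and number, then add tense after a period."""
    person_number = ABBREV[features["PERSON"]] + ABBREV[features["NUMBER"]]
    return f"{person_number}.{ABBREV[features['TENSE']]}"


caminabamos = {"TENSE": "PAST", "PERSON": "1ST", "NUMBER": "PLURAL"}
print(feature_value("caminábamos", caminabamos))
print("caminábamos_" + compact_label(caminabamos))
```

The bracketed form is more explicit because it names each feature; the compact label relies on the reader knowing which feature each abbreviated value belongs to.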


Further reading 


A useful survey of the kinds of meanings that are expressed by derivational 
morphology is found in Bauer (2002). 

The dichotomy and split-morphology approach to inflection and 
derivation is represented by works such as Scalise (1988a), Perlmutter 
(1988) and Anderson (1992), and it is implicit in much further work. The 
continuum approach is defended by Stephany (1982), Bybee (1985), Dressler 
(1989) and Plank (1994) (and see Wurzel (1996)). The tripartition between 
contextual inflection, inherent inflection and derivation was proposed by 
Booij (1993, 1996). 

In this chapter we assumed that morphology operates largely 
separately from syntax, and focused on the question of whether there is 
one morphological component or two. But not all theories make this 
assumption, including Word Syntax and Distributed Morphology: see
Toman (1998) and Embick and Noyer (2007), respectively, for overviews of
the syntax-morphology interface in these theories. Also, see the discussion 
in Chapter 9. 


Comprehension exercises 


1. Give the inflectional information of the following word-forms in
feature-value notation (see (5.19)):


Spanish caminabas (3 features, see Figure 5.8)
Latin insulam (2 features, see Figure 5.1)
Latin cantabit (5 features, see Figure 5.2)
English books (1 feature)
Serbian ovci (2 features, see (4.3))
Classical Nahuatl incal (2 features, see (2.7))
English bigger (1 feature)


2. Lezgian verbs have suffixes for aspect (-zawa imperfective, -nawa
perfect, -da habitual), followed by suffixes for polarity (-Ø affirmative,
-č negative), followed by suffixes for tense (-Ø present, -j/-ir past; -ir
is chosen after -č). For instance, katzawaj_IMPF.AFF.PST ‘was running’,
katdačir_HAB.NEG.PST ‘would not run’. Give the whole three-dimensional
paradigm in a two-dimensional representation (as in Figure 5.8), using
the verb kat- ‘run’ (i.e. a grid with 3 × 2 × 2 = 12 cells) (Haspelmath
1993).


3. Consider the meanings of the following denominal and deadjectival
verbs of English and classify them using the categories of Table 5.3. For
some of them, you need to set up new categories not represented in that
table.


butter, flatten, categorize, peel, legalize, phone, blacken, cannibalize, unionize,
skate, modernize, terrorize, ski


4. At the beginning of this chapter, we asked whether the English
deadjectival adverb-forming pattern (nice — nicely) is inflectional or 
derivational. Apply the criteria of Section 5.3 and try to form an opinion 
on this question. 


Exploratory exercise 


In language research, data sometimes point to opposite conclusions 
regarding the structure of the linguistic system. And when this happens,
linguists do not always agree on which data is most important and
convincing. The following is primarily a thought exercise designed to help 
you develop your own arguments about what constitutes (un)convincing 
data. It also allows you to practise applying the criteria in Table 5.5. You 
will use the results to decide for yourself which approach — continuum or 
dichotomy — seems to be the best way to describe the differences between 
inflection and derivation. 


Instructions 


Step 1: Pick a variety of morphological patterns in a language of your choice. 
You might want to choose several morphological patterns that express 
a similar meaning/function, for instance, all of the patterns expressing
plurality in nouns, and/or all of the patterns that turn verbs into abstract 
nouns. English examples for these functions are given in Table 5.7 and Table 
5.8 below, but you need not be limited by these (see Sections 5.1 and 5.2 for 
ideas). In fact, you do not need to work on English at all, but it is helpful to 
work on a language that you know quite well. 

Step 2: Using a good descriptive grammar of the language and/or your 
own knowledge, apply the eleven criteria for distinguishing inflection from 
derivation. For easy reference, we recommend organizing your analysis as 
shown in Tables 5.7 and 5.8 (for reasons of space we give only two of the 
criteria). 


Pattern         Example               Obligatory   Base allomorphy

-s              cat/cats              I            I
-s              house/houses          I            D
-es ([-iz])     index/indices         I            D
vowel change    goose/geese           I            ?
vowel change    man/men               I            ?
vowel change    crisis/crises         I            ?
-us → -i        alumnus/alumni        I            I (cross-formation, see (3.33))
-on → -a        criterion/criteria    I            I (cross-formation, see (3.33))
no change       sheep/sheep           I            I
-en             ox/oxen               I            I


Table 5.7 Morphological patterns for English plural nouns 


Note that it might be difficult or impossible to apply some criteria. For 
example, it is not clear how the base allomorphy criterion should be applied 
to morphological patterns that consist of base modification, rather than 
affixation.


Pattern        Example                  Obligatory   Base allomorphy

-ance/-ence    resist/resistance        D            I
               maintain/maintenance     D            D
-al            deny/denial              D            I
-age           cover/coverage           D            I
-ion           destroy/destruction      D            D
               hyphenate/hyphenation    D            D
-ation         consult/consultation     D            D
-ication       magnify/magnification    D            D
-ment          endorse/endorsement      D            I
-y             assemble/assembly        D            I
no change      walk/walk                D            D


Table 5.8 Morphological patterns for deverbal abstract nouns in English 


Determine whether each morphological pattern is inflectional or 
derivational. For instance, the pattern represented by cat-cats has no 
base allomorphy, and number is obligatorily expressed. Both of these are 
indicative of inflectional patterns, so the criteria lead us to categorize plural 
-s as inflectional. Conversely, deverbal -ion does not express an obligatory
meaning, and does induce base allomorphy, so both criteria indicate that 
this pattern is derivational. The criteria need not agree, of course (see, e.g. 
endorse/endorsement and house/houses); use a ‘majority wins’ principle to 
categorize each morphological pattern as inflectional or derivational. 
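
The ‘majority wins’ principle can be sketched as a simple tally. The judgements below are taken from Tables 5.7 and 5.8 (only the two criteria shown there); treating ‘?’ as an abstention and reporting ties as ‘undecided’ are assumptions of this sketch, since the instructions do not say how to handle them.

```python
from collections import Counter


def classify(judgements):
    """Tally 'I' and 'D' judgements; '?' marks a criterion that could not be applied."""
    counts = Counter(v for v in judgements.values() if v in ("I", "D"))
    if counts["I"] > counts["D"]:
        return "inflectional"
    if counts["D"] > counts["I"]:
        return "derivational"
    return "undecided"  # assumption: ties are left open


PATTERNS = {
    "cat/cats": {"obligatory": "I", "base allomorphy": "I"},
    "house/houses": {"obligatory": "I", "base allomorphy": "D"},
    "goose/geese": {"obligatory": "I", "base allomorphy": "?"},
    "destroy/destruction": {"obligatory": "D", "base allomorphy": "D"},
}

for example, judgements in PATTERNS.items():
    print(example, classify(judgements))
```

With only two criteria, a pattern such as house/houses ends in a tie; with all eleven criteria applied, ties become less likely, though they remain possible when some criteria cannot be applied.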

Step 3: Consider the following questions. 

1) Are some criteria more reliable than others? In other words, are 
there any criteria that always indicate that a given pattern is inflectional 
when the overall majority also categorize it as inflectional (and the same 
for derivation)? And if some criteria are more reliable than others in this 
sense, does this mean that these criteria are more valuable/important for 
distinguishing between inflection and derivation? Why or why not? 

2) Can you think of another way to evaluate the merits of each criterion? 

3) Are some morphological patterns ‘more inflectional’ or ‘more 
derivational’ than other patterns that express the same feature/meaning? If
yes, which criteria are primarily responsible for this result? How important 
do you think these properties are as data, relative to the others? Why? 

4) Return to the definitions of inherent and contextual inflection from 
Section 5.4.1: value making independent semantic contribution (inherent) 
vs. value assigned by agreement or government (contextual). Divide the 
inflectional patterns in your data according to this definition. Now look at 
the other criteria from Step 2. Are the inherent inflectional patterns more 
like derivation than the contextual inflectional patterns? In other words, is
the division into inherent/contextual inflection independently supported
by the other criteria? 

5) Based on the results of Step 2 and your answers to the previous 
questions, do you view inflection and derivation as dichotomous classes, 
or end-points along a continuum of traits? Support your conclusion by 
explaining your reasoning. 


Productivity 


A morphological rule or pattern is said to be productive if (and to the
extent that) it can be applied to new bases and new words can be
formed with it. The notion of productivity is in principle applicable both 
to word-formation and to inflection, but in this chapter we focus more on 
productivity in word-formation. 

A variety of questions arise when considering productivity: Is the 
productivity of a rule part of speakers' implicit knowledge of their 
language? What makes one rule more productive than another, i.e. what 
factors determine the likelihood of a given rule being used to create a 
new word? Also, are rules categorically productive or unproductive, or is 
productivity gradient? How can the productivity of a rule be measured, 
and how can the productivity of two rules be compared? In this chapter we 
explore possible answers to these questions. 


6.1 Speakers’ knowledge of productivity 


One might begin by asking why productivity should be such a big 
issue in morphology. After all, syntactic rules are productive as well, 
but few syntacticians worry much about how to define and determine 
their productivity (and no syntax textbook devotes an entire chapter to 
productivity). In syntax, linguists study possible sentences, and they do 
not care much whether these are actual sentences in some sense or not. 
Indeed, we could carry over this procedure to morphology and say that 
linguists who are interested in the morphological systems of languages 
should study possible words, regardless of whether these words happen 
to be actual words in common use or not. In other words, according to one 
view, morphological competence (speakers’ knowledge of the words and 
rules of the language) and morphological performance (the actual use of 
that knowledge for communication and other tasks) are conceptually quite
distinct and should therefore be studied separately. Linguists who adhere
to this position have often attempted to define away the whole issue of 
productivity by treating it as purely related to performance. 

One view that denies the relevance of productivity to the study of 
competence says that productivity is exclusively a diachronic phenomenon, 
i.e. one having only to do with the way in which morphological patterns
change over the centuries. When a neologism is coined, and especially 
when it is accepted by the other speakers and becomes a usual word, this 
means that a new word enters the language and the language thereby 
changes. Thus, some linguists argue that when a strictly synchronic point 
of view is adopted, the issue of productivity does not arise. But this does 
not seem quite right. It fails to explain why some morphological patterns 
are more likely than others to be used to coin new words. Hypothetical 
words like *helpnessful (with the wrong order of the suffixes -ful and -ness) or 
*frownity (where the suffix -ity attaches to a verb) are clearly ungrammatical, 
as every speaker will agree (Aronoff 1980). Since such judgements are 
otherwise routinely used to study linguistic competence, this suggests 
that the productivity of a rule should also be considered part of speakers' 
(synchronic) knowledge of their language. 

The relevance of productivity has also been denied by trying to equate 
the productivity of a rule with its restrictedness. The set of bases to which a 
rule could apply in principle is called its domain. Whenever the domain is 
less than the entire word-class, we say that there are systematic selectional 
restrictions on the rule. On this view, all morphological rules are equally 
productive, but they are not equally restricted. Some are quite unrestricted 
(like English -ness, which attaches to almost any kind of adjective), whereas 
others are heavily restricted (like English deadjectival -en in blacken, 
redden, etc., which attaches only to monosyllabic adjectives, among other 
restrictions). However, it is also quite unlikely that this view is correct. 
There are simply too many rules that are not obviously heavily restricted 
and yet their productivity is limited. For example, the English diminutive 
suffix -let (e.g. streamlet, piglet, booklet) could in principle combine with 
any monosyllabic concrete noun, but in fact it is very rarely used for new 
words. It is, of course, possible that such unproductive rules are subject 
to restrictions that have not been discovered yet, but we must regard it as 
more plausible that there is no such direct relation between the degree of 
productivity and unrestrictedness of a morphological rule. 

Thus, we have to accept that speakers’ knowledge of a language 
comprises not only words and rules, but also the probability of a given 
rule applying to create a new word, i.e. productivity. While we do not 
fully understand what causes a morphological rule to be productive or 
unproductive, research suggests a complex interaction of grammatical, 
social/pragmatic and processing restrictions (some of these are discussed 
in Sections 6.3-6.4 below), and speakers may even acquire some of their
knowledge of productivity by observing how morphological patterns are
used (and not used) in the community. It is thus a phenomenon that cannot 
be explained away, or ignored, and most linguists consider productivity an 
important topic for morphological study. 


6.2 Productivity, creativity and gradience 


When a rule is very productive, neologisms formed by that rule are hardly 
noticed — by speakers, hearers and lexicographers. For instance, English 
adjectives with the suffix -less can be formed quite freely (childless, joyless, 
shoeless, and so on), and if a speaker or writer creates a new word with 
-less (e.g. commaless: the poet writes in long, commaless sentences), this does 
not strike hearers or readers as particularly innovative. The author may not 
have noticed herself that she was using a new word. 

Some linguists have proposed that the unconscious nature of the 
formation of new words is not merely a typical property of highly 
productive rules, but should be a necessary criterion for regarding a rule 
as productive. According to this view, there is a sharp distinction between 
productivity and creativity. A productive rule allows speakers to form new 
words unconsciously and unintentionally, whereas creative neologisms 
are always intentional formations that follow an unproductive pattern. 
An example of a creative neologism would be the word mentalese (‘the 
mental language of our thoughts’), because new words with the suffix -ese 
(such as motherese, computerese, translationese) are probably always coined 
intentionally, and they immediately strike hearers and readers as new and 
unusual. (The word mentalese must have been coined by a philosopher in 
the middle of the twentieth century.) 

However, the proposed distinction between productivity and creativity 
has both a methodological and an empirical problem. The methodological 
problem is that it defines productive rule application as unconscious or 
unintentional, but we have no way of knowing what speakers' intentions 
and state of consciousness are when they form a new word. Moreover, we 
can distinguish consciousness and intentionality at several levels. When 
the philosopher coined the word mentalese, he or she probably intended to 
create a catchy single-word expression for a highly abstract concept that 
would make that concept more popular. At this level the coinage was no 
doubt conscious. But why did he or she not choose thoughtese or mindese, 
two words that would have made perfect sense to describe the language 
of our thoughts in the mind? It so happens that English words with the 
suffix -ese have a strong preference for a stress pattern strong-weak-strong 
(e.g. compùterése, motherése, translationése, and also Japanése, not *Japànése,
Vietnamése, not *Vietnàmése), and the words thoughtése and mindése would
not conform to this pattern (Raffelsiefen 1996). It seems unlikely that the
philosopher was aware of this phonological regularity, and in this sense
the choice of méntalése instead of thoughtese or mindese was probably 
unintentional. 

The empirical problem is that there are many rules that yield neologisms 
that are neither totally unremarkable nor immediately noticed. The English 
verb-deriving suffix -ize, for example, often forms new words, so it would 
be very odd to say that it is unproductive, but it may well be that quite a few 
of these new words are conscious creations (e.g. technical scientific terms 
such as pronominalize, transistorize, multimerize). 

Perhaps the term creativity is most appropriate when it is applied to 
violations of ordinary language norms (this is called poetic licence). Poetic 
licence manifests itself as the creation of novel words by unproductive rules. 
In English, verb + noun compounds of the type killjoy are unproductive, yet 
J. Thurber used kissgranny, and G. M. Hopkins coined daredeath. In Russian, 
the denominal suffix -ač (e.g. trubač ‘trumpeter’, from truba ‘trumpet’) is 
unproductive, but V. Mayakovsky created stixač ‘verse-maker’ (from stix
‘verse’), and V. Khlebnikov used smexač ‘laugher’ (from smex ‘laughter’)
(Dressler 1981). These cases should not be completely dismissed as abnormal 
use of language by a few exceptional individuals, because their poetry is 
intended for a (reasonably) wide audience, and readers must be expected at 
least to understand the neologisms. Thus, they provide interesting evidence 
that speakers are able to recognize the structure of unproductively formed 
words, and that the rules, even if unproductive by ordinary standards, at 
least exist. But the vast majority of newly formed words are not due to 
poetic licence. 

It seems more realistic to arrange rules on a continuous scale of 
productivity than to divide them into classes of ‘productive’ and ‘creative’ 
rules, or ‘productive’ and ‘unproductive’ rules. In this book we will 
therefore say that morphological rules can be gradiently productive, and 
the less productive a rule is, the greater the chance a neologism will be 
noticed. The suffix -ese is less productive than the suffix -less, so we expect 
-ese neologisms to be more striking than -less neologisms. 


6.3 Restrictions on word-formation rules 


So what makes a rule (relatively) productive or unproductive? In many 
cases, we can give specific reasons why a word-formation rule does not 
give rise to words that it might be expected to permit. For example, the 
German female-noun suffix -in (as in König-in ‘queen’, Löw-in ‘lioness’)
does not generally combine with the names of lower animals (examples
like Käfer-in ‘female beetle’, Würm-in ‘female worm’ can be found in small
numbers on the internet, but speakers are less likely to accept them, as
compared to Königin and Löwin); and the English suffix -ity systematically
fails to combine with adjectives ending in -ish, -y and -ful (*hopefulity). In
other words, the rules of -in and -ity have restricted domains. 

There is disagreement about whether such restrictions on a rule should 
be considered issues of productivity. Some linguists posit that a rule 
may be considered very productive if it routinely creates new words from 
bases within the rule’s domain (bases not within the domain are irrelevant). 
Others define productivity without regard to domain. Under this definition, 
a rule with an unrestricted domain may be productive or unproductive, 
but a rule with a very restricted domain cannot be highly productive 
because its restrictions keep it from contributing a large number of new 
words to the language. Which stance we take will affect how we measure 
productivity (see Section 6.5 for several productivity measures), but 
whichever definition we employ, it is clear that restrictions on the domain of 
a rule significantly limit the coining of new words. The kinds of restrictions 
that can be observed are discussed in this section. 


6.3.1 Phonological restrictions 


Phonological restrictions on the domain of a word-formation rule are 
particularly common with derivational suffixes, much less so with prefixes 
and compounding. In some cases, there is a straightforward reason for 
the restriction: certain complex words are impossible because they would 
create difficulties for phonetic processing (i.e. pronunciation or perception). 
A common restriction rules out the repetition of identical features, e.g. the 
repetition of the phoneme /ʎ/ (spelled ll) in Spanish (which reduces the
domain of the diminutive suffix -illo (see (6.1))), or the repetition of the
vowel /i(:)/ in English (which reduces the domain of the suffix -ee (see
(6.2))).


(6.1) Spanish diminutive suffix -illo 


mesa       mesilla        ‘(little) table’
grupo      grupillo       ‘(little) group’
gallo      *gallillo      ‘(little) rooster’
camello    *camellillo    ‘(little) camel’


(Rainer 1993: 18) 


(6.2) English patient-noun suffix -ee 


draw         drawee
pay          payee
free         *freeee
accompany    *accompanyee

(Raffelsiefen 1999a: 246)


Somewhat similar is the requirement that the derived word must have an
alternating rhythm (strong-weak-strong). As a result, the English suffix -ize
freely attaches to bases with a strong-weak rhythm, but does not attach to
bases that end in a strong (i.e. stressed) syllable. (The suffix -ese behaves 
similarly, as we saw in the previous section.) 


(6.3) English verbalizing suffix -ize 


private    privatize
global     globalize
corrupt    *corruptize
secure     *securize


(Raffelsiefen 1996; Plag 1999: ch. 6) 
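
The rhythmic restriction on -ize can be sketched as a check on the stress pattern of the base. The stress annotations below (S for a strong syllable, W for a weak one) are assigned by hand for the four bases in (6.3), and the function names are assumptions made for this illustration; a fuller account would need a pronunciation lexicon and would have to deal with further restrictions on the suffix.

```python
# Hand-annotated stress patterns: S = strong (stressed), W = weak syllable.
BASES = {
    "private": "SW",
    "global": "SW",
    "corrupt": "WS",
    "secure": "WS",
}


def allows_ize(stress_pattern):
    """The base must not end in a strong syllable."""
    return not stress_pattern.endswith("S")


def derive_ize(base):
    """Return the -ize verb, or None if the rhythmic restriction blocks it."""
    if not allows_ize(BASES[base]):
        return None
    stem = base[:-1] if base.endswith("e") else base  # spelling: drop final -e
    return stem + "ize"


for base in BASES:
    print(base, derive_ize(base))
```

The check derives privatize and globalize but returns nothing for corrupt and secure, in line with the starred forms in (6.3).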


Phonological restrictions can also be purely random. As noted above in 
Section 6.1, the English suffix -en (e.g. blacken, redden, tighten) attaches only 
to monosyllabic bases. This is an example of a phonological restriction, but 
not one with an (obvious) phonetic motivation. 


6.3.2 Semantic restrictions 


In many cases, the meaning of an affix automatically restricts the domain 
of a word-formation rule, because some base-affix combinations simply 
make no sense. For example, it would be nonsensical to add the German 
female-noun suffix -in to a noun like Baum ‘tree’ (*Bäum-in), because we do
not conceive of trees as having gender distinctions. Similarly, the English 
reversive prefix de- (as in de-escalate, decolonize) can be combined only with 
verbal bases that denote a potentially reversible process. Combinations 
such as deassassinate or deincinerate are hard to interpret, except perhaps in 
a science-fiction context. 

However, word-formation rules may also have semantic restrictions that 
seem quite arbitrary. For example, the Russian quality-noun suffix -stvo 
combines with adjectives that denote properties of human beings, not with 
adjectives denoting physical properties of objects. 


(6.4) Russian quality-noun suffix -stvo 


bogatyj ‘rich’               bogat-stvo ‘richness’
znakomyj ‘acquainted’        znakom-stvo ‘acquaintance’
udaloj ‘bold’                udal’-stvo ‘boldness’
lukavyj ‘wily’               lukav-stvo ‘cunning’
vjalyj ‘withered’            *vjal’-stvo
priemlemyj ‘acceptable’      *priemlem-stvo

(Svedova 1980: 179) 


Here there is no intrinsic reason why the suffix -stvo should not combine 
with other adjectives.


6.3.3 Pragmatic restrictions


In addition to being phonologically and semantically well-formed, a 
neologism must also be useful — this is what is meant by pragmatic 
restrictions. We noted at the beginning of this section that German speakers
do not generally accept female nouns in -in denoting lower animals (Käferin
‘female beetle’, Würmin ‘female worm’). It seems clear that these gaps in the German
lexicon are due to a pragmatic restriction: for animals like beetles and 
worms, it is simply not particularly useful to distinguish between males and 
females. Perhaps one should regard these derivations as potential German 
words, because it is not all that difficult to imagine a situation in which they 
might become useful (e.g. entomologists' specialized publications, or fairy 
tales). But ordinary speakers react to Käferin in much the same way as they
would to Bäumin, and it is not easy to argue that the former is a possible
word, while the latter is impossible. 


6.3.4 Morphological restrictions 


Some morphological patterns require special morphological properties 
in the base. For example, Modern Hebrew has a pattern for action nouns 
(CiC(C)uC) that is applied only to verbs of one particular inflection class 
(CiC(C)eC). Verbs of other inflection classes (CaCaC, hiCCiC, etc.) cannot 
form their action nouns in this way. 


(6.5) Modern Hebrew action-noun pattern CiC(C)uC 


diber ‘speak’           dibur ‘talk’
kibec ‘gather’          kibuc ‘gathering; kibbutz’
liked ‘unite’           likud ‘union; Likud’
tixnet ‘program’        tixnut ‘programming’
katav ‘write’           *kituv
hamad ‘desire’          *himud
hiskiv ‘put to bed’     *hiskuv


In Russian, the female-noun suffix -ja combines only with bases that are 
themselves derived by the suffix -un (see (6.6)). All other nouns must use 
some other female-noun suffix (-ka, -ša, -inja, -isa).


(6.6) Russian female-noun suffix -ja


govor-it’ ‘talk’      govor-un ‘talker’     govor-un’-ja
beg-at’ ‘run’         beg-un ‘runner’       beg-un’-ja
pljas-at’ ‘dance’     pljas-un ‘dancer’     pljas-un’-ja
lg-at’ ‘lie’          lg-un ‘liar’          lg-un’-ja


(Svedova 1980: 203) 


It appears that, with such nouns, the suffix -ja is 100 per cent possible, 
but since the suffix -un is not particularly common and not particularly 
productive, nouns in -ja are very rare.


6.3.5 Borrowed vocabulary strata


In some languages, a large part of the lexicon consists of loanwords 
from another language that is (or traditionally has been) well known to 
many speakers, at least in some influential section of the population. 
These loanwords usually include many morphologically complex 
words. If an isolated complex word is borrowed into another language, 
its morphological structure inevitably gets lost (thus, the English word 
orangutan is monomorphemic, although this is a compound noun in the 
source language, Malay: orang ‘man’, utan ‘forest’). But when a language 
borrows many morphologically complex words from the same language, 
their morphological structure may be preserved, and their patterns may 
become productive in the target language. For example, Japanese borrowed 
many verb-noun compounds from Chinese - e.g. those in (6.7). 


(6.7) Japanese V + N compounds (borrowed from Chinese) 


doku-syo    ‘reading a book’
kyuu-sui    ‘supplying water’
satu-zin    ‘killing a man’
noo-zei     ‘paying tax’
tuu-gaku    ‘going to school’
tai-kyoo    ‘staying in Tokyo’
hoo-bei     ‘visiting the United States’


(Kageyama 1982: 221-31) 


In some cases, the Chinese simple words were borrowed as well, but, in 
many others, these noun and verb stems exist only in compounds (e.g. 
bei- ‘US’ occurs only in compounds such as bei-koku [US-country] ‘United 
States’). The pattern of Chinese compounds is quite different from that of 
the corresponding native Japanese compounds, which take the form N + V 
(e.g. hito-dasuke [person-help] ‘helping people’, yama-nobori [mountain-climb] 
‘mountain climbing’). Thus, if Japanese had just borrowed a few compounds 
of the type in (6.7), they would have lost their morphological structure, 
but since they were borrowed in large quantities, these compounds are 
analyzable by Japanese speakers, and in effect Japanese borrowed the V + N 
pattern along with the compounds from Chinese. The pattern is productive 
in modern Japanese, and new compounds can be formed with it. 

However, and this is crucial in the present context, only stems borrowed 
from Chinese can be used in this compounding pattern. For example, the 
noun amerika (used with the same meaning as bei-koku) cannot be a second 
compound member (*hoo-amerika ‘visiting America’). Thus, the Chinese- 
Japanese morphological pattern is still restricted to the vocabulary stratum 
of Chinese-Japanese words. 

A similar situation can be found elsewhere. Many languages of India have 
borrowed heavily from the classical language Sanskrit and thus have many 
derived lexemes of Sanskrit origin. In Kannada (a Dravidian language that 
is not genealogically related to Sanskrit), many Sanskrit affixes are used 
extensively, but mostly with bases that are themselves Sanskrit loanwords. 
For instance, the quality-noun suffix -te can be used freely as in (6.8), but it 
does not combine with non-Sanskrit bases such as kulla ‘short’. 


(6.8) khacita ‘certain’ khacitate ‘certainty’ 


bhadra ‘safe’ bhadrate ‘safety’ 
ghana ‘weighty’ ghanate ‘dignity’ 
kulla ‘short’ *kullate 


(Sridhar 1990: 270, 278) 


In many European languages, we find an analogous situation with 
loanwords from Latin. English has borrowed particularly extensively 
from Latin, and suffixes like -ive, -ity, -ous and adjectival -al (as in parental, 
dialectal) are mostly restricted to bases of Latin origin (these are often called 
Latinate bases, as contrasted with Germanic bases). 


(6.9) act active fight *fightive 
brutal brutality brittle *brittality 
monster monstrous spinster *spinstrous 
parent parental mother *motheral (cf. maternal) 


Now the question arises how speakers could learn whether a stem belongs 
to the native or to the borrowed stratum — after all, speakers do not acquire 
the historical information of etymological dictionaries during their normal 
process of language acquisition. In many cases, the phonological peculiarities 
of the borrowed stratum are probably of some help. Thus, in Kannada only 
Sanskrit loans have aspirated consonants (kh, bh, gh), and, in Japanese, Chinese 
loan morphemes never have more than two syllables. But otherwise the only 
way to infer that a word belongs to the borrowed stratum is by observing that 
it combines (or fails to combine) with certain affixes. 

Perhaps because of this, the restriction of a word-formation pattern 
to a borrowed stratum is often unstable. Thus, English -ous has also 
been applied to non-Latinate bases (e.g. murderous, thunderous), and the 
Kannada Sanskrit-derived suffix -maya (e.g. haasya ‘humour’, haasya-maya 
‘humorous’) has also been applied to non-Sanskrit words (e.g. lanca ‘bribe’, 
lanca-maya ‘corrupt’; influuyens ‘influence’, influuyens-maya ‘influential’ 
(Sridhar 1990: 282)). The English suffixes -able, -ize, -ify, -ism seem to have 
lost their restriction to Latinate bases almost entirely. 


6.4 Productivity and the lexicon 


While these kinds of restrictions explain why some words fail to exist 
that might otherwise be expected, we are still left with the problem that 
morphological patterns that are largely unrestricted may nonetheless differ 
in degree of productivity. Why is this? 

This is a complicated issue, and one that is not yet fully understood. 
However, one hypothesis is that the productivity of a rule depends in large 
part on how the words that exemplify it are structured in the lexicon. (This 
also helps explain why productivity is more important to morphology than 
to syntax — both morphological and syntactic rules have restrictions, but 
only morphological structure is closely tied to the lexicon.) The main ways 
in which the lexicon is argued to influence morphological productivity are 
reviewed in this section. 


6.4.1 Processing restrictions 


There is growing evidence that productivity is tied to how words are 
mentally processed, specifically, the extent to which complex words are 
stored in the lexicon. Remember from Section 4.3 that under the hypothesis 
of a moderate word-form lexicon, many complex words (and all simple 
words) are stored in the lexicon, but some complex words are decomposed 
and only their component morphemes are stored. This difference can affect 
the memory strength of an affix — when an affix occurs often in words of 
the second type, the lexical entry for the affix will be frequently accessed, 
and as a consequence will have greater memory strength. However, an 
affix that occurs mostly in words stored in the lexicon will have relatively 
weak memory strength because mentally accessing a stored complex 
word bypasses the lexical entry for the affix. The hypothesis is that high 
memory strength makes an affix more readily available to speakers for use 
in coining new words. All else being equal, morphological patterns thus 
tend to be more productive if they predominantly occur in words that are 
decomposed, rather than stored (Hay and Baayen 2002). 

Of course, many factors influence whether a word is decomposed; see 
Section 4.3 for an overview. Here we review just one issue: relative frequency. 
When a complex word is less frequent than the base that it contains, it will tend 
not to be stored in the lexicon, all else being equal. For instance, modernity has 
a lower token frequency than modern, so modernity is likely to be decomposed: 
the high frequency of modern makes it efficient to mentally process modernity 
via the lexical entries modern and -ity. When the complex word is more 
common than its base (e.g. security has a higher token frequency than secure), 
the complex word will tend to be stored. With regard to productivity, the key 
question, then, is: What percentage of words with -ity are like modernity, and 
what percentage like security? This is the parsing ratio: the proportion of 
words with a given affix (or other morphological pattern) that are estimated 
to be decomposed, i.e. stored in the lexicon only as their component parts. 
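
The relative-frequency reasoning can be made concrete with a small sketch. It assumes that a word counts as decomposed exactly when it is less frequent than its base, which ignores all the other factors mentioned in Section 4.3. All frequency figures are invented for illustration; only the direction of the modernity and security comparisons comes from the text, and the two placeholder entries are hypothetical.

```python
# Invented token frequencies: derived word -> (its frequency, frequency of its base).
ITY_WORDS = {
    "modernity": (4, 120),        # rarer than its base: likely decomposed
    "security": (150, 30),        # more frequent than its base: likely stored
    "placeholder_a": (3, 40),     # hypothetical entry
    "placeholder_b": (90, 10),    # hypothetical entry
}


def is_decomposed(derived_freq, base_freq):
    """Relative-frequency heuristic: decompose a word that is rarer than its base."""
    return derived_freq < base_freq


def parsing_ratio(words):
    """Proportion of word types estimated to be decomposed rather than stored."""
    decomposed = sum(1 for derived, base in words.values()
                     if is_decomposed(derived, base))
    return decomposed / len(words)


ratio = parsing_ratio(ITY_WORDS)
print(ratio)
```

Note that the ratio is computed over word types, not tokens: each derived word counts once, however often it occurs.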

It turns out that complex words containing -ity are likely to be stored. This 
suffix has a parsing ratio of 0.17, meaning that about 17 per cent of unique 
complex words containing -ity are decomposed (the modernity type), and 
correspondingly, about 83 per cent of words containing -ity are stored in the 
lexicon (the security type). Parsing ratios for a sample of English suffixes are 
given in (6.10). 


(6.10) affix     parsing ratio    productivity (100 × P) 
       -ence     0.1              0.0 
       -ity      0.17             0.1 
       -ate      0.31             0.3 
       -dom      0.5              0.2 
       -ness     0.51             0.8 
       -ish      0.58             0.5 
       -like     0.68             38.1 
       -proof    0.8              5.5 
       -less     0.86             1.7 


(Hay and Baayen 2002: 233-5) 


A higher parsing ratio means that the affix is being activated in the lexicon 
relatively more often. Thus, the higher the parsing ratio, the more productive 
we would expect that affix to be. How to calculate the productivity measure 
P is described in Section 6.5(v) below, but here it is sufficient to note that 
larger numbers indicate greater productivity. While there is not a perfect 
correlation, the overall pattern is as expected (e.g. -ity is not particularly 
productive). 
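
One way to put a number on "not a perfect correlation" is a rank correlation over the nine suffixes in (6.10). The sketch below is only illustrative: Spearman's coefficient is a statistic chosen for this example, not one reported in the source, and the productivity value 0.0 for -ence is read off the table on the assumption that it belongs to that row.

```python
# (affix, parsing ratio, productivity as 100 x P), copied from (6.10).
TABLE_6_10 = [
    ("-ence", 0.10, 0.0),   # productivity value assumed to belong to this row
    ("-ity", 0.17, 0.1),
    ("-ate", 0.31, 0.3),
    ("-dom", 0.50, 0.2),
    ("-ness", 0.51, 0.8),
    ("-ish", 0.58, 0.5),
    ("-like", 0.68, 38.1),
    ("-proof", 0.80, 5.5),
    ("-less", 0.86, 1.7),
]


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman rank correlation for tie-free data."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


parsing = [row[1] for row in TABLE_6_10]
productivity = [row[2] for row in TABLE_6_10]
rho = spearman_rho(parsing, productivity)
print(round(rho, 2))  # 0.9
```

A rank-based statistic is used because the productivity values are very unevenly spread (-like is far above the rest), so only the ordering of the suffixes is compared.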

We should be cautious in our conclusions about the influence 
of morphological processing on productivity because most of the 
relevant research has been conducted on English. The relationship 
between processing and productivity in other languages remains to be 
investigated. But this kind of processing constraint may help explain a 
long-noted but imperfect correlation between semantic and phonological 
regularity of words created by a rule, and that rule's productivity. It is 
intuitively not surprising that the rule of -th suffixation in English is 
unproductive, because many of the words containing -th are irregular 
phonologically (depth, breadth, length, youth) or semantically (wealth is not 
just ‘being well’). But some completely regular rules are unproductive (e.g. 
the female-noun suffix -ess in English: poetess, authoress, princess), and some 
highly productive patterns have a fairly large number of irregular existing 
words (e.g. German -chen diminutives, as seen in idiomatized words 
like Brötchen ‘bread roll’, not ‘little bread’, Teilchen ‘particle’, not ‘little 
part’, Weibchen ‘animal female’, not ‘little woman’, Zäpfchen ‘uvula’, not 
‘little cone’). So it does not seem quite right to say that regularity directly 
determines productivity. However, semantic and phonological regularity 
are thought to influence whether complex words are stored in the lexicon, 
so there is likely to be an indirect relationship between regularity and 
productivity. 

6.4.2 Synonymy blocking 


Very often an otherwise productive derivational rule does not apply because 
it is pre-empted by an existing word that has the meaning of the potential 
neologism. For example, the verb to broom does not exist because the verb to 
sweep exists instead, which means the same as to broom would if it existed. 
Morphologists say that there is synonymy blocking (or just blocking, for 
short) under such circumstances. Apparently languages prefer not to have 
several words that mean exactly the same. Some other English examples are 
given in (6.11). 


(6.11) base          blocked word     blocking word    related pair 
       broom         *to broom        to sweep         hammer/to hammer 
       to type       *typer           typist           to write/writer 
       gymnastics    *gymnastician    gymnast          statistics/statistician 
       good          *goodly          well             bad/badly 


As the examples show, it is immaterial whether the blocking word is 
morphologically related to the blocked word or not. 

A puzzling fact about blocking is that it has many exceptions. For 
instance, English has synonymous pairs like piety/piousness, curiosity/ 
curiousness, accuracy/accurateness, etc. (Plank 1981: 175-80), in which one 
would expect the second member to be blocked by the first one. Here, one 
relevant factor may be the token frequency of the blocking word: the more 
frequent the blocking word is, the greater is its blocking strength (Plank 
1981: 182; Rainer 1988). Since the effect of frequency is relative, it is best to 
compare a range of cases that are structurally identical but differ in token 
frequency. We will look at quality nouns in Italian and German (Rainer 
1988: 167-71). 

The Italian quality noun suffix -ità is generally productive with adjectives 
ending in -oso such as furioso ‘furious’, furiosità ‘furiousness’. However, 
when the adjective in -oso is itself derived from a non-derived quality noun, 
this noun has the same meaning as the (potential) derivative in -ità, and the 
derivative is thus potentially subject to synonymy blocking. For example, the adjective 
bisognoso ‘needful’ (derived from bisogno ‘need’) does not form a quality 
noun *bisognosità, because this would have the same meaning as bisogno and 
is thus blocked by it. However, the blocking effect is not always observed. 
For instance, malizioso ‘malicious’ forms maliziosità ‘maliciousness’, 
although its base malizia ‘malice’ has the same meaning. When we look at 
the frequencies of a range of cases, we see that only the more frequent words 
have the blocking effect. In (6.12), the last column gives the frequency of the 
blocking word as determined by a frequency dictionary. (The frequency 0 
means that the corpus is not large enough to contain a token of the word, 
not that the word does not exist.) 

(6.12) base            potentially blocked word    blocking word             its frequency 
       coraggioso      *coraggiosità               coraggio ‘courage’        52.70 
       pietoso         *pietosità                  pietà ‘pity’              34.04 
       desideroso      *desiderosità               desiderio ‘desire’        31.92 
       fiducioso       *fiduciosità                fiducia ‘confidence’      30.79 
       orgoglioso      *orgogliosità               orgoglio ‘pride’          10.64 
       armonioso       armoniosità                 armonia ‘harmony’         4.13 
       rigoroso        rigorosità                  rigore ‘rigour’           3.42 
       malizioso       maliziosità                 malizia ‘malice’          0 
       acrimonioso     acrimoniosità               acrimonia ‘acrimony’      0 
       parsimonioso    parsimoniosità              parsimonia ‘parsimony’    0 
       ignominioso     ignominiosità               ignominia ‘ignominy’      0 
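
The claim that only the more frequent nouns block can be checked mechanically against (6.12). The following sketch simply tests whether every blocking noun is more frequent than every non-blocking noun in the table; the data structure and names are illustrative assumptions, while the figures are those given in (6.12).

```python
# (blocking noun, its frequency, is the -ità derivative blocked?), from (6.12).
TABLE_6_12 = [
    ("coraggio", 52.70, True),
    ("pietà", 34.04, True),
    ("desiderio", 31.92, True),
    ("fiducia", 30.79, True),
    ("orgoglio", 10.64, True),
    ("armonia", 4.13, False),
    ("rigore", 3.42, False),
    ("malizia", 0.0, False),
    ("acrimonia", 0.0, False),
    ("parsimonia", 0.0, False),
    ("ignominia", 0.0, False),
]

blocking = [freq for _, freq, blocked in TABLE_6_12 if blocked]
non_blocking = [freq for _, freq, blocked in TABLE_6_12 if not blocked]

lowest_blocking = min(blocking)            # rarest noun that still blocks
highest_non_blocking = max(non_blocking)   # most frequent noun that fails to block

# True if the two groups do not overlap in frequency.
cleanly_separated = lowest_blocking > highest_non_blocking
print(lowest_blocking, highest_non_blocking, cleanly_separated)
```

In this small sample the two groups do not overlap, but the table gives no reason to treat any particular frequency value as a fixed threshold.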


The frequency effect on blocking strength can also be observed when 
a productive quality noun rule competes with unproductive quality 
noun formations. In German, the suffix -heit ‘ness’ has all monosyllabic 
adjectives in its domain, but it is blocked when a different quality noun is 
available, e.g. *Reichheit ‘richness’ from reich ‘rich’ is blocked by Reichtum 
‘wealth’, which uses the unproductive suffix -tum. Again, the frequency of 
the blocking word is decisive, as shown in (6.13). 


(6.13) base               potentially blocked word    blocking word             its frequency 
       alt ‘old’          *Altheit                    Alter ‘(old) age’         1400 
       groß ‘big’         *Großheit                   Größe ‘size’              1301 
       tief ‘deep’        *Tiefheit                   Tiefe ‘depth’             613 
       warm ‘warm’        *Warmheit                   Wärme ‘warmth’            520 
       frisch ‘fresh’     *Frischheit                 Frische ‘freshness’       107 
       eng ‘narrow’       *Engheit                    Enge ‘narrowness’         67 
       blass ‘pale’       *Blassheit                  Blässe ‘paleness’         23 
       schnell ‘quick’    *Schnellheit                Schnelle ‘quickness’      23 


The explanation for the frequency effect on blocking strength is that 
frequent words have greater memory strength and are therefore retrieved 
faster from memory than rare words. When a German speaker wants to 
say ‘warmth’, she has two options: applying the productive rule of -heit 
suffixation or retrieving an existing word with that meaning — i.e. Wärme. 
Since Wärme is very frequent and thus easy to retrieve from the lexicon, it 
will win out in this case. When the existing word is rare, it has a relatively 
weak memory strength, and the process of forming a new word may be 
faster, so no blocking is observed (see Anshen and Aronoff 1988). Blocking 
thus represents both a semantic restriction on productivity (the desire of 
the language not to have two words with identical meaning), and also a 
constraint imposed by the lexicon (words with stronger representations in 
the lexicon create stronger blocking effects because they are more easily 
accessed). 


6.4.3 Productivity and analogy 


Finally, this book is primarily about synchronic morphology, the nature of 
morphological patterns as they function in a particular language at a given 
time. But, in order to understand synchronic patterns better, it is sometimes 
useful to consider the diachronic aspect of morphology, in particular here, 
analogical change. 

An analogical change is said to occur when speakers form a new word 
on the model of (or by analogy with) another word. For instance, the English 
verb fling used to have a past tense formed with the suffix -ed (flinged), but 
at some point the past-tense form flung was created, clearly on the model of 
verbs like sting/stung. In order to show clearly what happens in analogical 
change, linguists often use proportional equations as in (6.14). The two 
terms on the left-hand side of the equation represent the model, and the X 
on the right-hand side represents the target word that was newly created 
by analogy. 


(6.14) sting : stung = fling : X 
X = flung 


This is an example of analogical extension, which involves an existing 
morphological pattern being extended to a new lexeme. The i/u pattern 
was extended to the lexeme FLING, which was ‘new’ in the sense that it did 
not previously exhibit this pattern. Another example is the Polish plural 
suffix -owie. Originally this suffix occurred only with a few nouns (those 
belonging to the u-declension), e.g. syn ‘son’, plural synowie ‘sons’. But later 
it was extended to quite a few other nouns denoting male humans, e.g. pan 
‘lord, sir’, plural panowie (earlier plural form pany). 


(6.15) syn : synowie = pan : X 
X = panowie 


Analogical extension also occurs in derivational morphology. For instance, 
on the model of pairs of French loanwords such as change (verb), changeable (adjective), 
adjectives in -able were formed from native English words like wash: 


(6.16) change : changeable = wash : X 
X = washable 


Analogical extension can thus create a new lexeme, thereby enriching the 
lexicon, or it can lead an existing lexeme to switch from one inflectional 
pattern to another. One way to look at analogical change is as the diachronic 
result of synchronic productivity. 
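
The proportional equations in (6.14) to (6.16) can be solved mechanically, which may help to show how little machinery the notation presupposes. The sketch below is a deliberate simplification: it works on spelling rather than on phonological form, and it assumes that the model pair differs in a single contiguous stretch. The function name is hypothetical.

```python
def solve_analogy(model_base, model_derived, target_base):
    """Solve model_base : model_derived = target_base : X on spelled forms.

    Returns X, or None if the target does not end like the model base.
    """
    # Longest common prefix of the model pair.
    p = 0
    while (p < min(len(model_base), len(model_derived))
           and model_base[p] == model_derived[p]):
        p += 1
    rest_base, rest_derived = model_base[p:], model_derived[p:]

    # Longest common suffix of what remains after the prefix.
    s = 0
    while (s < min(len(rest_base), len(rest_derived))
           and rest_base[-1 - s] == rest_derived[-1 - s]):
        s += 1

    old = rest_base[:len(rest_base) - s]         # material that is replaced
    new = rest_derived[:len(rest_derived) - s]   # material that replaces it
    suffix = rest_base[len(rest_base) - s:]      # shared ending after the change

    tail = old + suffix
    if not target_base.endswith(tail):
        return None
    return target_base[:len(target_base) - len(tail)] + new + suffix


print(solve_analogy("sting", "stung", "fling"))        # flung
print(solve_analogy("syn", "synowie", "pan"))          # panowie
print(solve_analogy("change", "changeable", "wash"))   # washable
```

The solver only applies a given model; it says nothing about which model speakers actually choose, which is the question taken up next.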

The notation in terms of proportional equations suggests that a single 
word pair served as the model for the change, but in fact, there is no 
particular reason to assume that. The left-hand part of a proportional 
equation is best understood as a general pattern, a word-schema, rather 
than as a single word. If that is the case, then a formula such as sting/stung 
becomes virtually indistinguishable from word-based rules of the kind we 
have already encountered many times. 

In fact, thinking of the pattern (and all of the words that exemplify it) as 
the analogical model helps explain why patterns with low type frequency 
can be productive. Reconsider fling/flung. At first glance, a switch from the 
-ed pattern to the i/u pattern may seem odd. Why should a lexeme be drawn 
away from a pattern that is quite productive and has high type frequency, 
and begin to exhibit a pattern that applies to fewer than two dozen words? 
The key observation is that the i/u pattern is dominant among words that 
are phonologically like the target; see (6.17). 


(6.17) /(s)C(C)ɪŋ/    cling/clung, fling/flung, sling/slung, string/strung, 
                      wring/wrung, sting/stung, swing/swung 
       /(s)C(C)ɪŋk/   slink/slunk, shrink/shrunk (or shrank), stink/stunk 
       /(s)C(C)ɪn/    spin/spun, win/won 
       /(s)C(C)ɪg/    dig/dug 
       /(s)C(C)æŋ/    hang/hung 
       /(s)C(C)aɪk/   strike/struck 
       /(s)C(C)ɪk/    stick/stuck 
       /(s)C(C)iːk/   sneak/snuck (or sneaked) 


These words are all highly similar in phonological form, and all share a 
morphological pattern. Linguists call such a group of words a lexical gang. 
Lexical gangs often have fuzzy boundaries; the correspondence in (6.18) 
describes the prototypical members of the i/u gang. Peripheral members of 
the gang differ from this prototype to varying degrees. 


(6.18) [ /(s)C(C)ɪŋ/V      ]        [ /(s)C(C)ʌŋ/V   ] 
       [ ‘x’               ]   ↔    [ ‘x’            ] 
       [ TENSE: PRESENT    ]        [ TENSE: PAST    ] 


According to a reverse dictionary of Modern English, seven of thirteen 
verbs matching the left schema in (6.18) have the i/u pattern (sometimes 
alongside another pattern). Thus, while ding/dinged (and sing/sang, and 
bring/brought) are all possible analogical models, they are relatively unlikely 
because the i/u pattern has a high type frequency among verbs with the 
shape /(s)C(C)ɪŋ/. And as a result, the i/u pattern is a strong analogical 
model in this particular phonological context, despite being of much lower 
type frequency overall than the -ed pattern (Bybee and Moder 1983). 
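
The idea that type frequency is counted within a phonological neighbourhood, rather than across the whole lexicon, can be sketched as follows. The verb list contains only the verbs named in the text plus two non-matching controls, so it is a partial sample and not the thirteen verbs of the reverse dictionary. The regular expression is an orthographic stand-in for the schema, which is an assumption of this example, as are the pattern labels.

```python
import re
from collections import Counter

# Verbs named in the text, labelled with their past-tense pattern.
PAST_PATTERN = {
    "cling": "i/u", "fling": "i/u", "sling": "i/u", "string": "i/u",
    "wring": "i/u", "sting": "i/u", "swing": "i/u",
    "ding": "-ed", "sing": "i/a", "bring": "other",
    "walk": "-ed", "hang": "other",   # controls that do not match the schema
}

# Spelling-based approximation of the schema: optional s, one or two
# consonant letters, then "ing".
SCHEMA = re.compile(r"^s?[^aeiou]{1,2}ing$")


def pattern_counts(verbs):
    """Type frequency of each past-tense pattern among schema-matching verbs."""
    return Counter(pattern for verb, pattern in verbs.items()
                   if SCHEMA.match(verb))


counts = pattern_counts(PAST_PATTERN)
strongest_model, type_frequency = counts.most_common(1)[0]
print(strongest_model, type_frequency, sum(counts.values()))  # i/u 7 10
```

Within this sample the -ed pattern is represented by a single verb, even though it has by far the highest type frequency in the lexicon as a whole.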

We can conclude from this that the term analogical change emphasizes a 
diachronic outcome, whereas the term productivity emphasizes synchronic 
structure, but the solution of an analogical equation is in practical terms 
the same as the application of a (productive!) word-based rule to a novel 
lexeme. Moreover, the analogical perspective highlights that productivity 
is directly tied to words in the lexicon, because these words help determine 
the strength of the pattern as a model for analogical extension. 


6.5 Measuring productivity 


We have seen that productivity is best regarded as a gradable property of 
morphological rules. Thus, for each rule we may want to ask how productive 
it is—i.e. we want to measure the degree of productivity of word-formation 
rules. 

In syntax, acceptability judgements are widely taken to be direct 
reflections of speakers’ competence, and are used to distinguish possible 
and impossible sentences. However, speakers tend to be more reluctant 
to accept new words than to accept new sentences, maybe because they 
do not encounter new words very often in ordinary life. Consider the 
set of adjectives bearded (having a beard), winged (having wings), pimpled 
(having pimples), eyed (having eyes). The last word in this set, eyed, seems 
odd, and speakers may judge it unacceptable. But does that mean that it 
is truly ungrammatical - i.e. not allowed by the morphological system? A 
straightforward explanation of the difference in the acceptability of bearded, 
winged and pimpled, on the one hand, and eyed, on the other, is that not all 
creatures have beards, wings and pimples, but virtually all have eyes, so one 
would rarely describe a person or an animal as eyed. But consider a context in 
which cave-dwelling bugs, worms and other lowly creatures are discussed, 
and the focus is on whether they have eyes or not. In such a context, the use 
of eyed suddenly becomes much more plausible, and, confronted with this 
context, speakers would perhaps reverse their acceptability judgements. So 
just what these judgements mean is not always obvious in morphology. 
For this reason, most morphologists are interested in actual words when 
measuring productivity, in addition to speakers’ acceptability judgements 
of hypothetical words. 

Various measures of productivity have been proposed, but it turns out 
that they measure rather different things. 

(i) The number of actual words formed according to a certain pattern 
(also called degree of generalization, profitability of a pattern or type 
frequency). This is an interesting concept, and it is fairly easy to measure 
by examining a comprehensive dictionary (though, of course, this works 
only to the extent that the dictionary faithfully records all and only the 
actual words of the language). However, type frequency is not the same as 
productivity: according to this measure, the English suffix -ment has a high 
type frequency (English has hundreds of words like investment, harassment, 
fulfilment), but it is not productive - only four neologisms with -ment are 
attested in the OED for the twentieth century. Conversely, there are not 
many usual words with the suffix -ese (as in journalese), but this can be used 
freely to coin new words denoting a special language or jargon. 

(ii) The number of possible words that can be formed according to a 
certain pattern. This concept is much more difficult to measure, because 
it requires that we correctly identify all the restrictions on the pattern. But 
even then it is unlikely that the number of possible words reflects the likelihood 
that a new word can be coined. There are simply too many cases of (more 
or less) unproductive rules that do not seem to be restricted in any general 
way. For instance, en-/em- prefixation in English should be possible with 
any noun that denotes a container-like object (e.g. entomb, ensnare, embody), 
but the rule is simply not productive (cf. *embox ‘put into a box’, *encar ‘put 
in a car’). 

(iii) The ratio of actual words to possible words (also called the degree 
of exhaustion) (Aronoff 1976). Again, this requires that we be able to 
count the number of possible words, so it is not very practical. Moreover, 
when the possible bases include complex words that are themselves 
formed productively, the set of possible words becomes open-ended, and 
computing the ratio of actual to possible words is not really meaningful. 
For example, English or German N + N compounds can be formed freely 
without restrictions, and the compound members may be compounds 
themselves (see Section 7.1). Thus, the set of possible N + N compounds is 
staggeringly large (in principle, infinite), so the degree of exhaustion for N 
+ N compounds is necessarily quite low (even though there are plenty of 
actual N + N compounds, and the pattern is highly productive). 

(iv) The number of neologisms attested over a certain period of time 
(also called diachronic productivity). This measure can be determined if 
a good historical dictionary is available (such as the OED), but again only 
to the extent that the dictionary is reliable. And we saw earlier that if a 
pattern is very productive, lexicographers are likely to overlook new words 
with this pattern. Another technique uses large text corpora. By looking at 
a newspaper corpus of recent decades, it should be possible, for instance, 
to observe how the English semi-suffix -gate (as in Watergate, Irangate, etc.) 
gained (and perhaps lost) productivity over the years. 

(v) The ratio of hapax legomena with a given pattern to total token 
frequency of words with that pattern (called the P measure, or the category-conditioned 
degree of productivity) (Baayen and Lieber 1991; Baayen 1993). 
This measure stems from the observation that productive morphological 
rules are likely to produce occasionalisms, whereas unproductive rules 
are not likely to do so. Occasionalisms are thus particularly important for 
determining productivity. A hapax legomenon is a word that occurs only 
once in a corpus; in a very large corpus hapax legomena can be assumed 
to be occasionalisms. The category-conditioned degree of productivity is 
calculated by dividing the total number of hapax legomena in the corpus 
exhibiting a given morphological pattern by the total number of word 
tokens in the corpus that exhibit that same pattern: 


(6.19) Category-conditioned degree of productivity 
$P = V_{1,m} / N_m$ 


Here, $V_{1,m}$ is the number of hapax legomena in a corpus that exhibit 
morphological pattern m. $N_m$ is the collected token frequency of all 
words exhibiting pattern m. The ratio captures the likelihood that a word 
randomly drawn from the corpus and exhibiting the relevant pattern will 
be an occasionalism. 

(vi) The ratio of hapax legomena with a given pattern to all hapax 
legomena (called the P* measure, or the hapax-conditioned degree of 
productivity) (Baayen and Lieber 1991; Baayen 1993). This measure also 
relies on the tendency of productive rules to create occasionalisms, but P* 
measures the extent to which a given morphological pattern contributes to 
the total growth rate of the vocabulary. In this measure, the number of hapax 
legomena exhibiting a given morphological pattern is divided by the total 
number of hapax legomena in the corpus (with all morphological patterns): 


(6.20) Hapax-conditioned degree of productivity 
$P^* = V_{1,m} / V_1$ 


$V_1$ is the total number of hapax legomena in the corpus. This is thus similar 
to method (iv), but has the advantage that it measures productivity at a 
given moment in time, and a good historical dictionary is not needed. 
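
The two hapax-based measures in (v) and (vi) are simple to compute once a corpus has been tokenized. The sketch below is a minimal illustration under strong assumptions: the "corpus" is a toy token list invented for the example, and membership in the pattern is decided by a naive string test (a word ending in -ness), which would wrongly include words like witness in real data. The function and variable names are hypothetical.

```python
from collections import Counter


def productivity_measures(tokens, has_pattern):
    """Return (P, P*) for the pattern picked out by has_pattern.

    P  = hapaxes with the pattern / token frequency of all words with the pattern
    P* = hapaxes with the pattern / all hapaxes in the corpus
    """
    counts = Counter(tokens)
    hapaxes = [word for word, n in counts.items() if n == 1]

    v1 = len(hapaxes)                                           # all hapaxes
    v1_m = sum(1 for word in hapaxes if has_pattern(word))      # hapaxes with pattern
    n_m = sum(n for word, n in counts.items() if has_pattern(word))

    p = v1_m / n_m if n_m else 0.0
    p_star = v1_m / v1 if v1 else 0.0
    return p, p_star


# Invented toy corpus; a real study needs a very large corpus.
TOY_TOKENS = [
    "kindness", "kindness", "kindness", "darkness", "darkness", "blueness",
    "the", "the", "of", "night", "strange",
]

p, p_star = productivity_measures(TOY_TOKENS, lambda w: w.endswith("ness"))
print(round(p, 3), p_star)  # 0.167 0.25
```

In the toy data, one of the six -ness tokens is a hapax (P = 1/6), and one of the four hapaxes in the corpus has the pattern (P* = 1/4). With a corpus this small the values are of course meaningless as estimates of productivity.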

All of these measures are most useful for determining the relative 
productivity of patterns that serve similar functions, for example, -ness vs. 
-ity in English. It is often not clear how productivity measures should be 
interpreted when taken in isolation, or when comparing unlike morphological 
patterns (e.g. one that creates an adverb and one that creates a noun). 


Summary of Chapter 6 


The productivity of a morphological rule is a measure of the extent 
to which it can be used to create new words. Productivity is often 
regarded as a phenomenon that exclusively concerns language 
use (performance) or language change, but, in the view defended 
here, productivity is one part of speakers' knowledge of language 
(competence). Productivity itself must be an object of morphological 
study. 

Morphological patterns can be arranged on a scale from totally 
unproductive to highly productive. A rigid dichotomy between 
creativity and productivity does not seem to be very useful, because 
there are always intermediate cases. The productivity of a pattern may 
be limited in various ways: phonologically, semantically, pragmatically 
and morphologically. Sometimes a pattern is productive only within a 
borrowed vocabulary stratum. 

The structure of the lexicon is also a likely factor. Patterns with 
high memory strength (by virtue of being frequently accessed in the 
lexicon) tend also to be more productive. High frequency words can 
block newly coined words with the same meaning. Finally, lexical 
gangs and other patterns with low type frequency can be (quasi-)productive 
because they are phonologically densely clustered in the 
lexicon. 

Various quantitative measures of productivity have been proposed; 
the corpus measures based on hapax legomena (P and P*) are 
increasingly used. 


Further reading 


Excellent recent discussions of issues surrounding productivity are found in 
Plag (1999) and Bauer (2001b) (see also Kastovsky (1986) and Dressler and 
Ladányi (2000)). Bauer (2005) looks particularly at the history of theories of 
productivity. 

The view that competence and performance should be strictly separated 
is expressed in Di Sciullo and Williams (1987). Aronoff (1980) argues that 
productivity is relevant to competence and to the synchronic study of 
morphology more generally. The distinction between productivity and 
creativity is proposed in the classical paper Schultink (1961) (see also van 
Marle 1985). On productivity as a scalar notion, see Bauer (1992). 

The non-distinctness of analogy and morphological rules is pointed out 
in Becker (1990) and Blevins and Blevins (2009); see also Bybee (1985, 1988). 
Perhaps the most detailed analogical approach to morphological rules and 
productivity is Skousen (1989, 1992). The logical extreme of the analogical 
approach is connectionism. Classic articles in this framework are Rumelhart 
and McClelland (1986), MacWhinney and Leinbach (1991) and Daugherty 
and Seidenberg (1994). 

Blocking and its relation to frequency are discussed by Rainer (1988) and 
Anshen and Aronoff (1988). Aronoff (2007) shows that semantic regularity 
is not necessary for the creation of neologisms (which further implies that 
regularity and productivity are not quite the same thing). 

For better or worse, a disproportionate amount of work on productivity 
has focused on English past tense inflection. In addition to the connectionist 
papers listed above, see in particular Bybee and Slobin (1982), Bybee and 
Moder (1983), Pinker and Prince (1988), Derwing and Skousen (1994) and 
Albright and Hayes (2002). 


Comprehension exercises 


1. The productivity of the suffix -ity in English is heavily restricted (see 
the examples below). What might be the nature of the restriction, and 
into which of the categories of Section 6.3 does it fall? 


electric    electricity     bountiful    *bountifulity 
probable    probability     sonorant     *sonorantity 
captive     captivity       aimless      *aimlessity 
curious     curiosity       darkish      *darkishity 
abnormal    abnormality     fearsome     *fearsomity 


2. Recall Exercise 1 of Chapter 4. Of the words listed there, you have 
probably characterized reknow and happytarian as impossible words in 
English, although the affixes re- and -(t)arian are widely attested and 
productive in English. What is it about the nature of these affixes that 
makes them unsuitable for these bases? (In other words, in what way is 
their productivity restricted?) 


3. Modern Greek has two action-noun suffixes, -simo and -ma, which 
are both productive, but in different, complementary domains. Try 
to extract a generalization from the following examples that predicts 
when -simo occurs and when -ma is used. (Note that the phonological 
stem alternations are irrelevant.) 


VERB         MEANING         ACTION NOUN    MEANING 
djavázo      ‘I read’        djávasma       ‘reading’ 
kóvo         ‘I cut’         kópsimo        ‘cutting’ 
lúzo         ‘I bathe’       lúsimo         ‘bathing’ 
mangóno      ‘I squeeze’     mángoma        ‘squeezing’ 
pjáno        ‘I seize’       pjásimo        ‘seizing’ 
skondáfto    ‘I stumble’     skóndama       ‘stumbling’ 
tinázo       ‘I shake’       tínaγma        ‘shaking’ 
tréxo        ‘I run’         tréksimo       ‘running’ 


4. Which of the following words are impossible in the given meaning 
because of synonymy blocking? For words that cannot be explained as 
blocking, what is the reason for their impossibility? 
*musting (e.g. I hate musting get up every morning.) 
*foots (e.g. Bobby played outside and has dirty foots now.) 
*cooker (e.g. This meal is superb. The cooker is a real artist.) 
*bishopdom (e.g. The bishop often travels through his bishopdom.) 
*teacheress (e.g. Our teacheress is a very competent woman.) 
*slickize (e.g. The Zambonis slickize the ice before the skaters compete.) 
*certainness (e.g. Nowadays there is less certainness about church teachings.) 
*sisterlily (e.g. She embraced her sisterlily.) 


5. How did the suffixes (or perhaps bound roots) -(er)ati and -scape come 
into being? Consider the following examples: 
literati, glitterati, liberati, chatterati, soccerati, digiterati (Kemmer 2003) 
landscape, seascape, cloudscape, skyscape, waterscape, winterscape (Aldrich 
1966). 


Exploratory exercise 


Acceptability judgements are used less often in morphology than in syntax. 
Morphologists tend to study actual words, rather than potential ones, and 
the measures of productivity introduced in Section 6.5 reflect that. But 
acceptability judgements are still important to morphology and are, for 
example, cited in many places in this book, so we might wonder about the 
relationship between acceptability judgements and productivity measures. 
Do acceptability judgements and productivity scores measure the same thing? 
In this exercise, you will explore and compare the two types of measures. 

The instructions below use Russian for demonstration purposes, but this 
exercise could be conducted using any language for which a good electronic 
corpus or frequency list is available. Readers with basic programming 
skills will find it easier and more accurate to calculate productivity based 
directly on a corpus that is available electronically.¹ However, in the 
following instructions we assume that existing frequency lists or frequency 
dictionaries (i.e. a list/dictionary of words found in a corpus, with counts 
of how often each word occurs) will be used. 


Instructions 


Step 1: For the language you will investigate, find a frequency list or 
frequency dictionary, preferably one generated from a large corpus that 
includes a wide variety of textual and/or spoken data. The Russian 
examples in (6.21) and (6.22) were taken from a frequency list containing 
approximately 32,000 unique lexemes, which in turn was generated from a 
moderately sized corpus of modern Russian (about 16 million word tokens), 
from a variety of sources (Sharoff 2002). 


¹ The Linguistic Data Consortium is a major distributor of corpora: www.ldc.upenn.edu. 

Step 2: Pick 2-5 morphological patterns to study. The best choices will 
be derivational patterns that express the same meaning, or a very similar 
meaning, and intuitively seem to differ in productivity. (Derivational 
patterns are desirable, as opposed to inflectional ones, because inflectional 
patterns are more uniformly productive.) Two Russian patterns that meet 
these criteria are shown in (6.21).² 


(6.21)  pattern   abstract noun                base adjective 
        -stvo     bogat-stvo ‘wealth’          bogat-yj ‘wealthy’ 
                  p'jan-stvo ‘drunkenness’     p'jan-yj ‘drunk’ 
        -ost’     naivn-ost’ ‘naiveté’         naivn-yj ‘naive’ 
                  hrabr-ost’ ‘bravery’         hrabr-yj ‘brave’ 


Step 3: Calculate the hapax-conditioned degree of productivity (P*). This 
will require you to search your list or dictionary for all hapax legomena, 
regardless of morphological pattern, and record the total number that 
occur. (Most frequency lists are sorted from most frequent to least frequent, 
making this a relatively simple task.) Then, within the list of hapaxes, search 
for all words containing the chosen morphological pattern. For instance, 
of the 5,133 least frequent words in the Frequency Dictionary of Russian, 
44 words contained -stvo and 118 contained -ost’. Some examples of these 
hapaxes are listed in (6.22). 


(6.22) a. besčinstvo       ‘excess, enormity’ 
          svjatotatstvo    ‘sacrilege’ 
          provorstvo       ‘adroitness, dexterity’ 
          dissidentstvo    ‘nonconformism (rel.)’ 

       b. prazdnost’       ‘idleness’ 
          brennost’        ‘mortality’ 
          neperenosimost’  ‘unbearability’ 
          krasivost’       ‘beauty’ 


P*(-stvo) = 44/5133 = 0.0086, and P*(-ost’) = 118/5133 = 0.023. Thus, -ost’ is 
roughly 2.7 times as productive as -stvo according 
to the hapax-conditioned degree of productivity. 
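
Readers who prefer to compute rather than count by hand can sketch the Step 3 calculation in Python. This is only an illustration under stated assumptions: the function name, the dictionary format (each word mapped to its corpus count) and the toy counts are invented for the example, and matching an affix by its final letters will produce some false hits. The `max_count` parameter is likewise an assumption, meant for lists in which the least frequent words stand in for true hapaxes.

```python
def hapax_conditioned_productivity(freq, suffix, max_count=1):
    """Share of low-frequency words that end in `suffix`.

    freq      -- dict mapping each word to its corpus frequency (assumed format)
    suffix    -- string of final letters standing in for the affix
    max_count -- words with a count up to this value are treated as hapaxes
                 (1 = true hapax legomena)
    """
    hapaxes = [word for word, count in freq.items() if count <= max_count]
    if not hapaxes:
        raise ValueError("no hapaxes found; try a larger max_count")
    with_suffix = [word for word in hapaxes if word.endswith(suffix)]
    return len(with_suffix) / len(hapaxes)


# Toy frequency list (invented counts, for illustration only).
toy_freq = {
    "bogatstvo": 120, "provorstvo": 1, "naivnost'": 35,
    "prazdnost'": 1, "brennost'": 1, "dom": 900, "kniga": 1,
}
p_stvo = hapax_conditioned_productivity(toy_freq, "stvo")
p_ost = hapax_conditioned_productivity(toy_freq, "ost'")
print(p_stvo, p_ost)  # 1 of 4 hapaxes vs. 2 of 4 hapaxes

# The figures reported in Step 3 follow the same arithmetic.
print(round(44 / 5133, 4), round(118 / 5133, 3))
```

With a real frequency list, the toy dictionary would be replaced by counts read from the file, and the hits should still be checked by hand for words that merely happen to end in the same letters.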

Note: The accuracy of hapax-based productivity scores depends in 
part on the size of the corpus. Logically, a word that occurs only once in 
a 2 million-word corpus might occur about 10 times in a 20 million-word 
corpus (and thus it would not be counted as a hapax in the larger corpus), 
and a word that is too rare to be attested in a 2 million-word corpus might 
be a hapax in a larger corpus. And so presumably, the larger the corpus, the 
more likely it is that hapaxes will be true occasionalisms that are produced 
by the productive rule. For practical reasons, many frequency lists give only 
the words that occur at least a certain number of times in the source corpus. 
But assuming that such a list is large enough to include many words that 
are rarely used, it should be possible to treat the most infrequent words on 
the list as if they were ‘true’ hapaxes, even if they occur more than once. 
This sacrifices some degree of accuracy, but in many respects it is equivalent 
to using a smaller corpus. 

² It is helpful if the sequence of letters in the chosen affixes is not likely to produce many 
false hits. For instance, the English suffix -y creates abstract nouns from verbs (e.g. assemble 
→ assembly), but it would be a poor choice because many adjectives and other words also 
end in the letter y (e.g. happy). By contrast, in English words ending in the letters ation, this 
is almost always a suffix (e.g. consult → consultation). 

Step 4: Based on the results of Step 3, develop predictions about speakers’ 
acceptability judgements. Do you expect speakers to be more likely to 
accept possible words when the affix is more productive? Less likely? No 
relation? Do you expect speakers to give potential words exhibiting the 
more productive patterns higher ratings on an acceptability scale? Why or 
why not? Explain your reasoning. 

Step 5: Test your predictions with native speaker informants. Make a list 
of at least 10 possible words for each of the affixes. These test words should 
be built on real bases, and should obey the restrictions on the domain of 
the rule, but not be actual words. (See Section 6.3 for discussion of domain 
restrictions.) For instance, Russian has the adjectives matovyj ‘matted, 
suffuse (of light)’ and laskovyj ‘tender, gentle’, but the derived abstract 
nouns matovost’ and laskovost’ are not actual words. They can be considered 
possible words. Devise a ratings scale and ask at least three native speakers 
to judge the acceptability of these test words. 

Step 6: Analyze the data. Determine whether there is a correlation 
between the productivity scores and speakers’ acceptability judgements. Do 
speakers’ judgements match your predictions? If not, develop hypotheses 
about factors that might influence one measure, but not the other. Also 
consider the possible impact of your data collection methodology on your 
results, for both types of measures. Finally, think about the relationship 
between speakers’ acceptability judgements and hapax-conditioned 
productivity. Do they measure the same thing, or different things? Explain 
your reasoning. 
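
One simple way to begin the comparison in Step 6 is sketched below in Python. It is a hedged illustration, not a prescribed method: the function names, the data layout and all ratings are invented, and with only two patterns a correlation coefficient would not be informative, so the sketch merely checks whether the two measures rank the patterns in the same order.

```python
from statistics import mean


def mean_ratings(ratings):
    """Average all judgements collected for each pattern.

    ratings -- dict: pattern -> list of numeric judgements (assumed format),
               pooled over test words and speakers
    """
    return {pattern: mean(values) for pattern, values in ratings.items()}


def same_ranking(scores_a, scores_b):
    """True if both measures order the patterns identically (ties ignored)."""
    order_a = sorted(scores_a, key=scores_a.get)
    order_b = sorted(scores_b, key=scores_b.get)
    return order_a == order_b


# P* values from Step 3; the ratings are invented (1 = bad, 5 = fine).
p_star = {"-stvo": 0.0086, "-ost'": 0.023}
ratings = {"-stvo": [2, 3, 2, 1, 3, 2], "-ost'": [4, 3, 5, 4, 4, 3]}

means = mean_ratings(ratings)
print(means)
print(same_ranking(p_star, means))
```

If more patterns are studied, a rank correlation over all of them would be a natural next step, but agreement in ranking alone does not show that the two measures capture the same thing.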


Morphological 
trees 


In this chapter we will see that various kinds of morphologically complex 
words can be thought of as having hierarchical structure. In this respect, 
morphological structure resembles syntactic structure, and the ways in 
which morphological and syntactic structure are similar and different are 
important. At issue is the degree to which syntactic principles govern word- 
level structure. Until now, we have largely assumed that morphological 
rules exist in a separate area of the grammar from syntactic rules, with each 
area subject to its own principles. However, strong commonalities between 
morphological and syntactic structure might suggest that the grammatical 
system is not divided in this way. In this chapter we assess the evidence 
related to hierarchical morphological structure. Hierarchical structure is 
quite evident in compound words, and less so in derivationally derived 
words. Thus, we will start by examining compounds in some detail. 


7.1 Compounding types 


A compound is a complex lexeme that can be thought of as consisting of 
two or more base lexemes. In the simplest case, a compound consists of two 
lexemes that are joined together (called compound members). Some examples 
from English are given in (7.1). English allows several types of combinations 
of different word-classes (N: noun, A: adjective, V: verb), but not all such 
combinations are possible. 


(7.1) English compounds: some examples¹ 

N + N   lipstick       (lipN + stickN) 
A + N   hardware       (hardA + wareN) 
V + N   drawbridge     (drawV + bridgeN) 
N + V   babysit        (babyN + sitV) 
N + A   leadfree       (leadN + freeA) 
A + A   bitter-sweet   (bitterA + sweetA) 

¹ Note that the spelling of English compounds is inconsistent: often they are written as a 
single word, but in many other cases (especially with N + N compounds), the constituents 
of the compound are separated by a space, like syntactic phrases (e.g. sugar plantation, 
morpheme lexicon). These spelling differences are irrelevant in the present context and 
should be ignored. 


Compounding rules may differ in productivity. In English, the N + N 
pattern is extremely productive, so novel compounds are created all the 
time and are hardly noticed. By contrast, the V + N pattern is unproductive 
and limited to a few lexically listed items, and the N + V pattern is not really 
productive either. For instance, one cannot say to hair-wash ‘wash one’s 
hair’, and the small handful of examples like babysit from Section 3.2.2 are 
mostly backformations from nouns, and are not produced directly by N + 
V compounding rules. 

However, there are many languages (especially morphologically rich, 
polysynthetic languages) that do allow compounds in which the notional 
object and the verb form a compound. Such compounding processes 
are called noun incorporation (metaphorically we say that the object is 
incorporated into the verb). An example from Alutor is given in (7.2). (For 
more on noun incorporation, see Section 11.2.1.) 


(7.2) gamma   ta-moang-ilgatav-ak 
      I       1SG-hand-wash-1SG 
      ‘I washed (my) hands.’ (Lit.: ‘I hand-washed.’) 
      (Koptjevskaja-Tamm and Muravyova 1993: 298) 


In a compound that consists of two lexemes, it is really the lexeme stems 
that are combined, not inflected forms. In this respect compounding is no 
different from derivational affixes, which attach to stems. Thus, we get 
English compounds such as lipstick (not *lipsstick), although it is used for 
both lips, and child support (not *children support), even if several children 
are supported. While we have already seen some examples in which the 
first lexeme of a compound is inflected (e.g. publications list from Section 
5.3.7), this is not common. 

That the first compound member is almost always a stem, not an inflected 
word-form, can be seen most clearly in languages with richer inflection, 
such as Sanskrit. In Sanskrit, the first compound member in N + N/A 
compounds shows a vowel-final (or -r-final) form that does not occur as 
a member of the inflectional paradigm - this can thus be regarded as the 
pure stem. 


(7.3) deva-sena-    ‘army of gods’         (devah ‘god’) 
      pitr-bandhu-  ‘paternal relation’    (pita ‘father’) 
      pati-justa-   ‘dear to the spouse’   (patih ‘spouse’) 

In German, many compounds even have a special semantically empty 
suffix (sometimes called an interfix) on the first compound member, which 
forms the stem that is appropriate for compounding. Some examples are 
shown in (7.4). 


(7.4) German compounds with interfixes 


Volk-s-wagen       lit. ‘people’s car’   (Volk ‘people’ + Wagen ‘car’) 
Liebe-s-brief      ‘love letter’         (Liebe ‘love’ + Brief ‘letter’) 
Schwan-en-gesang   ‘swansong’            (Schwan ‘swan’ + Gesang ‘song’) 


That the first member of a compound is a stem rather than a particular 
word-form is also clearly seen in German V + N compounds, as in (7.5). 


(7.5) German V + N compounds 
Wasch-maschine   ‘washing machine’    (wasch-en ‘wash’ + Maschine ‘machine’) 
Schreib-tisch    ‘(writing) desk’     (schreib-en ‘write’ + Tisch ‘desk, table’) 
Saug-pumpe       ‘suction pump’       (saug-en ‘suck’ + Pumpe ‘pump’) 


The elements wasch-, schreib- and saug- must be pure stems, because almost 
all word-forms of verbs have special suffixes (the suffix -en in (7.5) is the 
infinitive (and citation-form) suffix). The only suffixless word-form is the 
imperative, but it would not make sense semantically to claim that wasch in 
Waschmaschine is the imperative form of the lexeme WASCHEN. 

From the point of view of semantics, not much needs to be said about 
the compounds that we have seen so far. The first compound member 
generally serves to modify and narrow the meaning of the second 
compound member. Thus, a lipstick is a special kind of stick (not a special 
kind of lip), a drawbridge is a special kind of bridge and a love letter is a 
special kind of letter. Since semantically the second member is in this sense 
more important, it is referred to as the semantic head of the compound, 
and the modifying element is called the dependent. Examples like lipstick 
belong to the endocentric type of compound (the term endocentric means 
that the semantic head (or centre) of the compound is ‘inside’ (endo-) the 
compound). In endocentric compounds, the meaning of the entire word is a 
subset of the meaning of the lexeme that serves as the head. In English, the 
semantic head of an endocentric compound is always the second member, 
but in other languages such as Spanish, the head is the first member. 


(7.6) hombre-rana   ‘frogman’      (hombre ‘man’ + rana ‘frog’) 
      año luz       ‘light year’   (año ‘year’ + luz ‘light’) 
      pez espada    ‘swordfish’    (pez ‘fish’ + espada ‘sword’) 


The semantic relations that obtain between the head and the dependent 
in compounds are quite diverse: purpose (writing desk, lipstick), appearance 
(hardware, swordfish), location (garden chair, sea bird), event participant (e.g. 
agent: swansong, patient: flower-seller), and so on. There thus seem to be almost 
no restrictions on the kinds of semantic relations that may hold between the 
dependent and the head in compounds (at least in the languages in which 
compound meanings have been studied extensively). It is our knowledge 
of the world that tells us that a flower-seller is someone who sells flowers, 
and that a street-seller is someone who sells something on the street. But it 
is easy to imagine a world (say, a fable about commercially active bees) in 
which selling goes on on flowers, and even easier to imagine a world in 
which people specialize in selling entire streets. English morphology does 
not seem to say more than that the dependent must be in some kind of 
pragmatically sensible relation to the head. 

Not all compounds are of the endocentric type. Compounds may also 
be exocentric (i.e. their semantic head is ‘outside’ (exo-) the compound). 
Exocentric compounds can be illustrated with examples from Ancient Greek. 


(7.7) kakó-bios         ‘having a bad life’ 
                        (kakós ‘bad’ + bios ‘life’) 
      polu-phármakos    ‘having many medicinal herbs’ 
                        (polús ‘much’ + phármakon ‘herb’) 
      hēdú-oinos        ‘having sweet wine’ 
                        (hēdús ‘sweet’ + oinos ‘wine’) 
      megaló-psukhos    ‘having a large mind, i.e. magnanimous’ 
                        (mégas ‘large’ + psukhḗ ‘mind’) 


A compound such as hēdú-oinos refers to someone who has sweet (hēdú-) 
wine (oino-), so it denotes a kind of person, not a kind of ‘sweet’ nor a 
kind of ‘wine’. The semantic head is ‘outside’ the compound: the reference 
to ‘someone’ must be inferred from the structure as a whole - there is 
no morpheme that refers to a person or to ownership. English has a few 
exocentric A + N compounds of this semantic type (redhead ‘someone who 
has red hair’, highbrow, lazybones), but this pattern is hardly productive in 
English. 

Another type of exocentric compound is illustrated by the Italian 
examples in (7.8). 


(7.8) portabagagli ‘trunk’ (portare ‘carry’ + bagagli ‘luggage’) 
lavapiatti ‘dishwasher’ (lavare ‘wash’ + piatti ‘dishes’) 
asciugacapelli ‘hair dryer’ (asciugare ‘dry’ + capelli ‘hairs’) 


Here the ‘external’ semantic head is an instrument for carrying out an 
action on an object. Again, English has a few exocentric V + N compounds 
as well (referring to people rather than instruments: pickpocket, cutthroat, 
killjoy), but this pattern is totally unproductive in English. 

Using our word-based notation of Section 3.2.2, the rules that yield these 
exocentric compounds can easily be represented formally. 

(7.9) Rule for Italian exocentric compounds of (7.8) 

[ /Xre/V.INF ]       [ /Y/N.PL ]         [ /XY/N.SG                     ] 
[ ‘doₓ’      ]   &   [ ‘ys’    ]   ↔    [ ‘instrument for doingₓ ys’   ] 


Here the compound word-schema on the right contains the additional 
meaning element ‘instrument for’, which is not associated with a particular 
element of phonological form, but with the pattern as a whole (cf. the rule 
in (3.30), which is similar in this respect). 
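
The formal side of the rule in (7.9) can also be stated procedurally. The following Python sketch is only an illustration: the function name is invented, it handles just the form of the compound (not the added meaning ‘instrument for’), and it has been checked only against the three -are verbs in (7.8).

```python
def italian_vn_compound(infinitive, plural_noun):
    """Combine a verb stem with a plural noun, following the schema in (7.9).

    Assumes the infinitive ends in -re (the /Xre/ schema); the stem X is
    whatever precedes -re.
    """
    if not infinitive.endswith("re"):
        raise ValueError("expected an infinitive in -re")
    stem = infinitive[:-2]          # portare -> porta
    return stem + plural_noun       # porta + bagagli -> portabagagli


for verb, noun in [("portare", "bagagli"), ("lavare", "piatti"),
                   ("asciugare", "capelli")]:
    print(italian_vn_compound(verb, noun))
```

Nothing in the function corresponds to the instrument meaning, which mirrors the point made above: that meaning belongs to the pattern as a whole rather than to any piece of the form.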

Besides endocentric and exocentric compounds, there are also compounds 
that have more than one semantic head. In these compounds, each member 
has a separate referent. Both members are on an equal footing, and they 
can be paraphrased with ‘and’, so they are called coordinative compounds. 
Some examples from Korean are in (7.10). 


(7.10) elun-ai    ‘adult and child’      (elun ‘adult’ + ai ‘child’) 
       ma-so      ‘horses and cattle’    (ma ‘horse’ + so ‘cow’) 
       non-path   ‘farm’                 (non ‘rice field’ + path ‘dry field’) 
       o-nwui     ‘brother and sister’   (o ‘brother’ + nwui ‘sister’) 
       son-pal    ‘hand and foot’        (son ‘hand’ + pal ‘foot’) 
       (Sohn 1994: 416-7) 


Coordinative compounds are widespread in the world’s languages, but 
they happen to be rare in European languages, including English. 

Another, more familiar type of compound is represented by examples 
such as (7.11) from Spanish, where both compound members have the same 
referent. Such compounds are called appositional compounds. 


(7.11) poeta-pintor          ‘poet who is also a painter’ 
       actor-bailarín        ‘actor who is also a dancer’ 
       compositor-director   ‘composer who is also a director’ 


English also has some compounds of this kind (student worker, Marxism- 
Leninism), and adjective compounds such as bitter-sweet and deaf-mute can 
be subsumed under this type as well. 

The last type of compound to be mentioned here is again exocentric, but 
it shares with coordinative compounds the feature of semantic equality of 
both compound members. A few examples from Classical Tibetan are given 
in (7.12). 


(7.12) rgan-gžon    ‘age’       (rgan ‘old’ + gžon ‘young’) 
       yag-ñes      ‘quality’   (yag ‘good’ + ñes ‘bad’) 
       mtho-dman    ‘height’    (mtho ‘high’ + dman ‘low’) 
       srab-mthug   ‘density’   (srab ‘thin’ + mthug ‘thick’) 
       (Beyer 1992: 105) 

The semantic head of these compounds is something like ‘property’, so 
rgan-gžon is literally ‘property (in the feature) of old and young’, i.e. ‘age’. 


7.2 Hierarchical structure in compounds 


As we saw in the preceding section, the concept ‘semantic head’ is useful 
for talking about the kinds of semantic relations that may obtain between 
the members of a compound. In this section, we will see that not only a 
semantic notion of ‘head’, but also a formal notion of ‘head’ can play a role 
in morphology. Let us look at a number of examples of compounds and 
their tree diagrams. 


[Figure: tree diagrams of four two-member compounds: English lipsticks (lip + sticks), Spanish años luz ‘light years’ (años + luz), English babysits (baby + sits) and German Luftpirat ‘air pirate, hijacker’ (Luft + Pirat)] 

Figure 7.1 Compound trees: two compound members 


Tree diagrams indicate hierarchical structure. In Figure 7.1 this is rather 
unexciting, because the compounds consist of only two lexemes. However, 
tree representations of compounds are particularly useful when a compound 
consists of members that are compounds themselves, because in that case 
several different hierarchical structures are possible. Two possibilities for 
three-term compounds are shown in Figure 7.2, and Figure 7.3 shows two 
possibilities for compounds with four terms. In the compound Berkeley 
Linguistics Society, for instance, the tree diagram shows that the second 
and third lexemes form a compound inside the larger structure. This 
hierarchical structure corresponds to the semantics: the Berkeley Linguistics 
Society denotes a kind of linguistics society, namely one established at the 
University of California, Berkeley. By contrast, in particle physics conference, 
the first two lexemes are grouped into a compound because it is a conference 
about particle physics. 

[Figure: tree diagrams for [Berkeley [Linguistics Society]] and [[particle physics] conference]] 


Figure 7.2 Compound trees: three compound members 


[Figure: tree diagrams for Indiana University Linguistics Club and University Cashiers Office employee] 


Figure 7.3 Compound trees: four compound members 


Sometimes a compound with more than two nouns may allow two 
hierarchical structures simultaneously. For example, a compound like 
nuclear power station can be grouped as [[nuclear power][station]] or as 
[[nuclear][power station]] with equal justification, because both make sense 
semantically, and both the compounds nuclear power and power station exist 
in English. 

Tree diagrams can also be used to indicate the formal head of a compound. 
The notion of a formal head is mostly relevant to the endocentric type, 
for which the formal head and the semantic head coincide. In the trees in 
Figures 7.1 to 7.3, the formal head of each compound is symbolized by a 
double line connecting the head and the next higher node in the tree. This 
is straightforward for compounds with two members, as in Figure 7.1, but 
note that in larger compounds, there is a formal head for each component 
compound, e.g., conference is the formal head of particle physics conference, 
but physics is the head of the compound particle physics. 
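
The nesting described here can be made concrete with a small Python sketch. It rests on assumptions that go beyond the text: compounds are represented as nested pairs, only right-headed (English-type) endocentric compounds are covered, and the function names and the simplified bracket notation are invented for the illustration.

```python
def formal_head(tree):
    """Return the rightmost lexeme, i.e. the formal head in a right-headed
    language such as English.

    tree -- a lexeme (str) or a pair (left, right) of subtrees
    """
    while not isinstance(tree, str):
        _left, tree = tree          # descend into the right-hand member
    return tree


def bracketing(tree):
    """Render the tree in a simplified bracket notation."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    return "[" + bracketing(left) + " " + bracketing(right) + "]"


particle = (("particle", "physics"), "conference")
berkeley = ("Berkeley", ("Linguistics", "Society"))

print(bracketing(particle), formal_head(particle))
print(bracketing(particle[0]), formal_head(particle[0]))
print(bracketing(berkeley), formal_head(berkeley))
```

Each embedded pair has its own head, so the inner compound particle physics is headed by physics even though the whole compound is headed by conference.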

We can identify at least two characteristics of formal heads in endocentric 
compounds. First, we see that in the noun lipsticks in Figure 7.1, sticks 
contains the number marking that characterizes the whole compound. 
The formal head of a compound is thus the morphosyntactic locus of the 
compound, in that it is the place where the morphosyntactic features of 
the compound are expressed. 

An astute reader might object to analyzing lipsticks as [[lip][stick-s]], in 
which plurality is a property of the head. An alternative proposal is [[lip] 
[stick]]s, where the plural suffix attaches to the complete compound word 
rather than to the head. This alternative works for this particular example, 
but attachment to the complete compound does not work for a case like 
Spanish años luz ‘light years’ (singular: año luz ‘light year’). As we saw 
in (7.6), Spanish has left-headed N + N compounds, and if plurality is a 
property of the entire compound, not the head, we expect año luz-es, rather 
than the correct Spanish form años luz. At least in Spanish, then, the locus 
of plurality is the morphological head, and it is simpler to make the same 
analysis in ambiguous cases like lipsticks as well. 

Second, the formal head determines for the entire compound 
characteristics such as word-class, gender and inflection class. For instance, 
the English word babysits is a verb, just like its head sits, but unlike the 
nonhead baby. In Spanish años luz and German Luftpirat ‘air pirate, hijacker’, 
the nonheads luz ‘light’ and Luft ‘air’ are feminine, but the compound 
nouns are masculine, just like their heads. And in German Luftpirat we also 
see that the inflection class of the head is shared by the compound: both 
Pirat and Luftpirat are ‘weak’ nouns - i.e. their genitive singular suffix is -en 
rather than the more common -s. This can also be illustrated from English: 
the plural of church mouse is church mice, not *church mouses — i.e. the head 
determines the way the plural of the compound is formed. 
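
The idea that the formal head is the morphosyntactic locus can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name, the list representation of compounds and the tiny plural tables are all invented, and the position of the head is simply supplied by hand.

```python
def pluralize_compound(members, head_index, pluralize):
    """Inflect only the formal head of a compound.

    members    -- list of compound members in surface order
    head_index -- position of the formal head (-1 = last, 0 = first)
    pluralize  -- function giving the plural of a single lexeme
    """
    result = list(members)
    result[head_index] = pluralize(result[head_index])
    return " ".join(result)


# Tiny hand-written plural tables (illustrative, not real lexicons).
english_plurals = {"mouse": "mice", "stick": "sticks"}
spanish_plurals = {"año": "años"}

# English is right-headed, Spanish N + N compounds are left-headed.
print(pluralize_compound(["church", "mouse"], -1, lambda w: english_plurals[w]))
print(pluralize_compound(["año", "luz"], 0, lambda w: spanish_plurals[w]))
```

Because the plural is looked up for the head alone, the irregular plural of mouse carries over to church mouse without any extra statement.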

As we would expect, compounds that are not semantically endocentric 
do not necessarily behave like formally headed compounds. Thus, in 
coordinative compounds we often find double plural marking (e.g. Spanish 
actores-bailarines, compositores-directores, etc.). Also, the English exocentric 
compound sabretooth (‘a tiger whose teeth are like sabres’, not ‘a tooth that 
is like a sabre’) forms the plural sabretooths (not sabreteeth), and switchfoot 
(‘a surfer who can ride with either the right or left foot forward’) forms 
the plural switchfoots. (But by contrast, Blackfoot ('a person belonging to the 
Blackfoot Native American tribe’) most commonly has the plural Blackfeet, 
showing that there is some variation in this regard.) 


7.3 Hierarchical structure in derived lexemes 


Complex lexemes formed by derivational affixes are not unlike compounds 
in several respects, and many morphologists use tree representations 
to show the relations between the base and affixes. As with compounds, 
hierarchical tree structures are capable of showing semantic relations in 
derived lexemes in a salient way. For example, the two trees in Figure 7.4 
distinguish the two different meanings of undoable very clearly. Undoable₁ 
(‘cannot be done’) is derived from doable with the negative prefix un-, and 
undoable₂ (‘can be undone’) is derived from undo with the suffix -able. The 
tree structures in Figure 7.4 show these derivational origins quite directly. 


[Figure: tree diagrams for undoable₁ [un [do able]] ‘which cannot be done’ and undoable₂ [[un do] able] ‘which can be undone’] 


Figure 7.4 Two meanings and two structures of undoable 


Sometimes different orderings of affixes yield significantly different 
meanings, and then hierarchical structure can be posited as well. Consider 
(7.13) from Capanahua. 


(7.13) a. pi-catsih-ma-hue 
eat-DESID-CAUS-IMPV 
‘Make him hungry.’ (Lit.: ‘Make him want to eat.’) 


b. pi-ma-catsihqu-i 
eat-CAUS-DESID-PRS 
‘He wants to feed it.’ (Lit.: ‘He wants to make it eat.’) 
(Payne 1990: 228; data from Eugene Loos) 


Both of these example words contain desiderative (‘want to do something’) 
and causative (‘make someone do something’) suffixes, yet they have different 
meanings (even setting aside differences represented by the imperative and 
present tense suffixes). We can posit that the difference of meaning reflects 
a difference of internal structure. In particular, in the first example -ma 
[CAUS] attaches to the base pi-catsih- [eat-DESID], resulting in the hierarchical 
structure [[pi-catsih]-ma]. Semantically, this means that -ma does not modify 
the root pi- ‘eat’, but rather the entire base, meaning ‘want to eat’. The result 
is the meaning ‘make want to eat.’ (Linguists say that -ma has semantic scope 
over pi-catsih-.) Likewise, in the (b) example, -catsihqu [DESID] attaches to and 
has semantic scope over the base pi-ma- [eat-CAUS] ‘make eat’, resulting in the 
meaning ‘want to make eat’. The different orderings are therefore associated 
with different semantic scope, so two very different readings arise. This is just 
like syntax, and a tree-like representation as in syntax captures the properties 
of these affixes quite well. 
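
The link between suffix order and semantic scope can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function name and the English paraphrase templates are invented, and each suffix is taken to scope over everything to its left, as in the two Capanahua examples.

```python
def scope_paraphrase(root_gloss, affix_glosses):
    """Build a literal paraphrase in which each suffix scopes over
    everything to its left.

    root_gloss    -- gloss of the root, e.g. "eat"
    affix_glosses -- suffix glosses in surface (left-to-right) order
    """
    templates = {"DESID": "want to {}", "CAUS": "make {}"}  # assumed glosses
    meaning = root_gloss
    for gloss in affix_glosses:       # innermost suffix applies first
        meaning = templates[gloss].format(meaning)
    return meaning


print(scope_paraphrase("eat", ["DESID", "CAUS"]))  # make want to eat
print(scope_paraphrase("eat", ["CAUS", "DESID"]))  # want to make eat
```

The loop applies the suffixes from the inside out, which is the procedural counterpart of the bracketing [[pi-catsih]-ma].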

In addition to showing semantic relations, tree representations have 
also been used in morphology for expressing certain formal properties 
of derived lexemes. Some examples of such representations are given in 
Figure 7.5. 


[Figure: tree diagrams for English read-able (A), Russian car’-stvo ‘czardom’ (N), Polish awans-ować ‘promote’ (V) and English de-louse (V), each showing the word-class of the base and of the affix] 


Figure 7.5 Tree representations of derived lexemes 


The English suffix -able is mostly attached to verbs (and occasionally to 
nouns, as in fashionable), turning them into adjectives. As we saw in Chapter 
5, it is quite typical of derivational affixes that they change the word-class 
of their base lexeme. This can be expressed by saying that the derivational 
affixes belong to a word-class (noun, verb, adjective) just like full lexemes 
and stems, and that they may be the heads of the corresponding derived 
lexemes. Since the formal head determines word-class (as we saw for 
compounds in the preceding section), the word-class of the resulting lexeme 
is that of the derivational affix. Thus, read-able is an adjective because -able 
belongs to the word-class of adjectives, Russian carstvo ‘czardom’ is a noun 
because of -stvoN, and Polish awans-ować ‘promote’ is a verb because of 
-owaćV. And as in compounds, derivational affixes also typically determine 
the gender of nouns (as is illustrated by Russian -stvo, which derives neuter 
nouns) and the inflection class of the derived lexeme (-stvo derives nouns of 
the o-declension, and Polish -ować derives verbs of the -owa/-uj conjugation). 

Not all derivational affixes are heads. Many derivational affixes do not 
determine the word-class and other properties of their derived lexemes. In 
the European languages, this is true in particular of prefixes and diminutive 
suffixes. Three such non-head affixes from three languages are listed in 
(7.14). 


(7.14)      English co-     Spanish pre-                 Italian -ino 
       N    co-author       pre-historia ‘prehistory’    tavol-ino ‘little table’ 
       A    co-extensive    pre-bélico ‘pre-war’         giall-ino ‘yellowish’ 
       V    co-exist        pre-ver ‘foresee’      (Adv) ben-ino ‘rather well’ 


However, derivational affixes often behave like heads of compounds, and 
this may be regarded as a sufficient reason for treating them as heads as in 
Figure 7.5. 
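
The difference between head and non-head affixes can be summarized in a small Python sketch. It is an illustration only: the `Affix` class, the use of `None` to mark a non-head affix and the function name are assumptions introduced here, not part of any established notation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Affix:
    form: str
    word_class: Optional[str]   # None marks a non-head affix


def derived_word_class(base_class, affix):
    """Word-class of the derived lexeme: the affix's class if the affix is
    a head, otherwise the class of the base."""
    return affix.word_class if affix.word_class is not None else base_class


able = Affix("-able", "A")    # head: read (V) -> readable (A)
co = Affix("co-", None)       # non-head: author (N) -> co-author (N)

print(derived_word_class("V", able))
print(derived_word_class("N", co))
print(derived_word_class("V", co))   # exist (V) -> co-exist (V)
```

A fuller model would let a head affix also pass on gender and inflection class, as Russian -stvo does.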


7.4 Parallels between syntax and morphology? 


In this book we have already looked at the architecture of the language 
system in terms of whether inflection and derivation are split between 
two different components of the grammar (the so-called split-morphology 
hypothesis; see Section 5.5). However, there is another, equally important 
question about the architecture of the system: Is morphology formally 
distinct from syntax? In other words, do the same principles apply to both 
word formation and sentence formation? There is a variety of evidence 
that bears on this question, but in this chapter we restrict ourselves to 
the evidence that is related to formal heads. (Other considerations will be 
added in Chapter 9 in the context of something called the Lexical Integrity 
Principle. See also the Further reading section of this chapter.) 

Some morphologists argue that compounds and derived lexemes 
have internal hierarchical structure and formal heads because syntactic 
structure is hierarchical and consists of heads and dependents. In other 
words, syntax and morphology are similar because syntactic principles 
govern word-internal structure. Others posit that compounding and 
derivation operate largely independently of syntax. In this perspective, 
it is possible for morphological structure to be similar to syntactic 
structure, but this need not be true. Empirically, then, strong parallels 
between (hierarchical) morphological structure and syntactic structure 
would suggest a language architecture in which there is no distinct 
morphological component, since this provides a more natural explanation 
for similarities. Strong differences would be inconsistent with a syntactic 
approach to morphology, and support an architecture in which there are 
distinct morphological and syntactic components. The question to be 
answered, then, is: To what extent are morphological heads and syntactic 
heads similar? 

Endocentric compounds and syntactic phrases share a semantic trait — in 
both cases, the dependent member narrows the meaning of the head. For 
instance, in the compound doghouse the dependent member dog specifies 
a particular kind of house. And in the syntactic phrase house for a dog, the 
dependent for a dog has the same kind of semantic relationship to the head 
noun house. In this limited sense, then, heads in endocentric compounds 
and in syntactic phrases are similar. 

However, in syntax, there are also three other purely formal properties 
that heads share: 

(7.16) Syntactic head properties 
a. The head is the morphosyntactic locus, i.e. it bears inflectional 
markers that belong to the whole phrase. 
b. The head may govern the form of its dependents. 
c. A dependent may agree in person/number with its head. 


As introduced in Section 5.3.1, government is a kind of syntactic relation 
in which one word requires another word or phrase to have a particular 
inflectional value. The latter is thus dependent upon the former for its 
properties. Similarly, in agreement, the inflectional value of a dependent 
word or phrase must be the same as the inflectional value of another word 
or phrase in the sentence. 

These properties can be illustrated with the sentence in (7.17) (a Russian 
example is chosen because the inflectional properties are less salient in 
English). In Figure 7.6, a tree diagram for this sentence is given. 


(7.17) Student-y     pomaga-l-i     zavedujusc-ej     kafedr-oj.
       student-PL    help-PST-PL    chairwoman-DAT    department-INS
       ‘The students helped the chairwoman of the department.’


[Tree diagram for (7.17); terminal nodes and glosses:]

Student-y     pomaga-l-i      zavedujusc-ej     kafedr-oj
student-PL    help-PAST-PL    chairwoman-DAT    department-INSTR


Figure 7.6 A tree diagram for (7.17) 


In the tree in Figure 7.6, the head of each phrase is symbolized by a double 
line between the head and the next higher node in the tree, called the 
phrasal node. We see that the verb pomagali is the morphosyntactic locus of 
the sentence; it bears the tense marking that characterizes the whole clause. 
Likewise, nouns are the morphosyntactic locus of their noun phrases (NPs); 
they bear the case and number markers that ultimately belong to the NP. 
We also see two examples of government: The verb pomagali requires its 
dependent object NP to have dative case, and the noun zavedujuscej requires 
its dependent complement NP to have instrumental case. Finally, pomagali 
agrees with the subject NP in number. 

Of these three syntactic head properties, only morphosyntactic locus 
applies to compounds as it does to syntactic phrases. The remaining two 
syntactic head properties cannot be observed because the dependent in 
compounds does not in general bear inflectional features. As we saw in 
Section 7.1, the dependent member in compounds is an uninflected stem 
whose inflectional form cannot be governed and which cannot be the target 
of agreement. 

Moreover, similarities between syntactic heads and derivational heads 
are even weaker. The semantic criterion does not apply here, for obvious 
reasons: reality is not a kind of -ity, something that is yellowish is not a kind of 
-ish, and so on. So the similarity to the syntactic notion of head is tenuous, 
and many morphologists have expressed scepticism about the usefulness of 
carrying over this notion to affixes. It must also be kept in mind that not all 
derivational patterns involve affixes. It may be possible to describe English 
carri-er as a headed structure, but Arabic ħammaal ‘carrier’ (from ħamala 
‘carry’) cannot be so described. So for derivational patterns that involve 
base modification or other non-concatenative operations, it is not clear that 
even morphosyntactic locus is relevant. 


Head characteristic          Syntactic    Compound    Derivational
Dependent narrows meaning    +            +           -
Morphosyntactic locus        +            +           (+)
Government                   +            -           -
Agreement                    +            -           -


Table 7.1 Properties of formal heads in syntax, compounding and derivation 


In short, hierarchical structure at the word level is superficially similar 
to hierarchical structure at the level of the sentence, but parallels between 
syntactic and formal morphological heads are only partial for compounds, 
and even less strong for derivation. Clearly, there might be some underlying 
principle of language that causes both syntax and compounding (and 
maybe derivation) to exhibit hierarchical structure and head properties, but 
the principles governing syntactic structure and the principles governing 
morphological structure are not identical. The degree of formal separation 
between morphology and syntax is a topic of ongoing debate. 




Summary of Chapter 7 


There are many types of nominal compounds: endocentric 
compounds, exocentric compounds, and various kinds of 
compounds with more than one semantic head (e.g. coordinative 
compounds and appositional compounds). Like syntactic phrases, 
(endocentric) compounds are often conveniently described as having 
hierarchical structure and formal heads. These are represented using 
tree diagrams. Such hierarchical structures are often also applied to 
derived lexemes, and derivational suffixes are often described as the 
heads of their words. However, derivational suffixes share only a few 
properties with the heads of syntactic phrases, and even compounds 
do not exhibit the syntactic head properties of government and 
agreement. Parallels between syntactic and morphological structure 
are thus not particularly strong, indicating that principles of word 
formation are, at least to some degree, distinct from principles of 
sentence formation. 


Further reading 


For a cross-linguistic survey of compounding, see Bauer (2001a) and 
Guevara and Scalise (2009). For noun incorporation, see Mithun (1984). For 
coordinative compounds, see Olsen (2001). Investigation of compounds 
from psycholinguistic and neurolinguistic perspectives has only recently 
begun in earnest, but see Libben and Jarema (2006). 

The approach that uses hierarchical structures is most prominently 
represented by works such as Selkirk (1982), Di Sciullo and Williams (1987), 
Lieber (1992) and Embick and Noyer (2007). On heads in morphology, see 
in particular Williams (1981a), Scalise (1988b) and Haspelmath (1992), and, 
for some sceptical voices, see Reis (1983) and Bauer (1990). A recent defence 
is found in Štekauer (2000). 

Regarding the broader argument of whether the morphological 
component can be collapsed into the syntactic one, classic arguments 
in favour are Baker (1985) and Pesetsky (1985). Specific attacks on the 
syntactic approach to morphological structure include Spencer (1997) 
and Smirniotopoulos and Joseph (1998). The typological evidence for 
compounding as a syntactic process is assessed in Sadock (1998) and Baker 
(1998), to opposite conclusions. 


Comprehension exercises 


1. Draw tree diagrams (analogous to those in Figures 7.1-7.3) for the 
following English compounds: 

family planning adviser, undersea cable repair team, fixed-line phone system, 
mad cow disease hysteria, World Trade Center rescue worker, credit card 
agreement form, major league baseball game 


(For some of these, two different solutions may be correct.) 


2. Consider the following Sanskrit compound stems and, judging by their 
meaning, determine the type of compound in each case. (Note that 
in Sanskrit, morphophonological alternations often slightly alter the 
shape of roots at morpheme boundaries.) 


aśvakovida- ‘knowledgeable about horses’ 
bahuvrīhi- ‘having a lot of rice’ 
divyarūpa- ‘having divine shape’ 
gṛhapati- ‘houseowner’ 
mahārāja- ‘great king’ 
mahātman- ‘having a big soul, i.e. magnanimous’ 
priyasakhī- ‘dear friend’ 
rājarṣi- ‘poet who is also royalty’ 
śuklakṛṣṇa- ‘bright and dark’ 
sukhaduḥkha- ‘joy and pain’ 


A list of relevant Sanskrit nominal and adjectival roots: 


ātman- ‘soul’          gṛha- ‘house’ 
aśva- ‘horse’          kovida- ‘knowledgeable’ 
bahu- ‘much’           kṛṣṇa- ‘dark’ 
divya- ‘divine’        mahat- ‘big’ 
duḥkha- ‘pain’         pati- ‘lord’ 
priya- ‘dear’          śukla- ‘bright’ 
ṛṣi- ‘seer, poet’      sakhī- ‘female friend’ 
rājan- ‘king’          sukha- ‘joy’ 
rūpa- ‘shape’          vrīhi- ‘rice’ 


3. In Spanish, there are two homophonous adjectives inmovilizable: 
inmovilizable1 ‘unmobilizable’ and inmovilizable2 ‘immobilizable’. The 
morphological structure of these words corresponds closely to the 
structure of the corresponding English words (prefix in- ‘un-’, suffix 
-able ‘-able’, suffix -iz ‘-ize’, móvil ‘mobile’). Draw the constituent 
structure trees of these two words. 


4. Russian has a productive class of exocentric A + N compounds 
comparable to the Ancient Greek compounds in (7.7): 

dlinnorukij ‘long-armed’       dlinnyj ‘long’     ruka ‘arm’ 
krasnoborodyj ‘red-bearded’    krasnyj ‘red’      boroda ‘beard’ 
černokožij ‘black-skinned’     černyj ‘black’     koža ‘skin’ 
tolstonogij ‘thick-legged’     tolstyj ‘thick’    noga ‘leg’ 


Formulate the word-based rule for these compounds. 


(Note: -yj and -ij are phonological allomorphs. Russian consonants can 
be phonologically ‘hard’ or ‘soft’. The adjectival allomorph -yj is used 
when the preceding consonant is hard, and -ij when it is soft. Since the 
difference results from phonology, the conditions for -yj vs. -ij do not 
need to be captured by your rule.) 


Exploratory exercise 


In this chapter, we saw that compounds and maybe derived lexemes can be 
described as having hierarchical structure. In some respects, word structure 
thus has much in common with sentence structure. We have also asked 
whether these similarities are sufficient to posit that the same or closely 
related principles govern both syntax and morphology. In this task, you will 
delve deeper into the parallels (or lack thereof) between sentence structure 
and word structure by conducting a typological survey of word order and 
the order of affixes and stems. 

Typologists have demonstrated that among the world’s languages, some 
grammatical structures systematically co-occur with others. For example, 
dominant word order is closely connected to the order of prepositions/ 
postpositions relative to nouns. (A postposition is like a preposition, except 
that it comes after the noun (post-), rather than before it (pre-). They are 
collectively referred to as adpositions.) In some languages, the dominant 
order for a sentence with both a subject and an object is subject-verb- 
object (abbreviated SVO). English is such a language: John (=subject) read 
(=verb) the letter (=object). In other languages, both the subject and the 
object normally come before the verb (SOV order). Japanese is this kind of 
language, as shown in (7.18). Other languages may have yet other dominant 
word orders. 


(7.18) John ga     tegami o      yon-da 
       John SBJ    letter OBJ    read-PST 
       ‘John read the letter.’ 


Interestingly, languages with SVO order are very likely to have 
prepositions, whereas languages with SOV order are very likely to 
have postpositions. Table 7.2 shows the number of languages with each 
combination of word order and adposition type, based on a sample of 750 
languages. 

Basic Order    Prep    Post 
SOV               8     328 
SVO             278      29 
VSO              69       6 
VOS              20       0 
OVS               2       7 
OSV               0       3 


Table 7.2 Correlation between dominant word order and prepositions vs. postpositions 
(Dryer 2008a, b) 


Linguists thus say that the order of subjects and objects relative to verbs 
and the order of adpositions relative to nouns exhibit an implicational 
relationship. Knowing the dominant word order of a language greatly 
increases the chance of correctly predicting whether adpositions come 
before or after nouns. (Of course, implicational relationships can have 
exceptions. Finnish and Estonian, for example, are predominantly SVO 
languages, but they have postpositions. Implicational relationships thus 
represent likely co-occurrence rather than an absolute correlation.) 
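
The strength of this implicational relationship can be checked directly against the counts in Table 7.2. The following is only a minimal sketch: the function names are invented for illustration, and 'accuracy' here simply means the share of the 750 sampled languages for which guessing the majority adposition type would be correct.

```python
# Counts copied from Table 7.2: basic order -> (prepositions, postpositions).
TABLE_7_2 = {
    "SOV": (8, 328),
    "SVO": (278, 29),
    "VSO": (69, 6),
    "VOS": (20, 0),
    "OVS": (2, 7),
    "OSV": (0, 3),
}


def preposition_share(table):
    """Share of languages with prepositions, for each basic word order."""
    return {order: prep / (prep + post) for order, (prep, post) in table.items()}


def guess_accuracy(table):
    """Accuracy of always guessing the majority adposition type.

    Returns a pair: (accuracy when word order is ignored,
                     accuracy when word order is known).
    """
    total = sum(prep + post for prep, post in table.values())
    all_prep = sum(prep for prep, _ in table.values())
    all_post = total - all_prep
    baseline = max(all_prep, all_post) / total
    informed = sum(max(prep, post) for prep, post in table.values()) / total
    return baseline, informed


if __name__ == "__main__":
    for order, share in preposition_share(TABLE_7_2).items():
        print(f"{order}: {share:.1%} prepositional")
    baseline, informed = guess_accuracy(TABLE_7_2)
    print(f"ignoring word order: {baseline:.1%} correct")
    print(f"knowing word order:  {informed:.1%} correct")
```

On these figures, a guess that ignores word order is right for only about half of the sample, whereas a guess based on the dominant word order is right for 94 per cent of it.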

Linguists are interested in this implicational relationship because it 
suggests that the generalization governing the order of subjects/objects 
and verbs may be closely related to the generalization governing the order 
of adpositions and nouns. In particular, the generalization seems to be that 
most languages are either predominantly head-initial or predominantly 
head-final, meaning that syntactic heads occur at the beginning or end of 
phrases, respectively. Adpositions are the heads of their phrases, and verbs 
are the heads of VPs. Thus, if a language has both SOV word order and 
postpositions, it is predominantly head-final. 

The purpose of this research exercise is to determine whether there is 
an implicational relationship between word order and the order of affixes 
relative to stems, and to explore the importance of this typological data 
for the following research question: Are the principles governing word 
order and morpheme order closely related? If sentence structure and word 
structure follow the same basic principles, we might expect an implicational 
relationship to hold between them. 


Instructions 


Step 1: Develop hypotheses and predictions. Remember that a hypothesis is 
a formal guess about the relationship between syntactic and morphological 
rules (i.e. are they different?). A prediction is the result that you expect 
in your data, based on your hypothesis. So, for instance, if a language is 
syntactically head-initial, do you expect to find that morphological heads 
(i.e. affixes) will also precede their dependents (stems)? What would be 
predicted by a theory in which morphological and syntactic rules are not 
distinct? By a theory in which they are distinct? 

Step 2: Collect data using World Atlas of Language Structures (WALS). The 
atlas is available both as a book accompanied by CD-ROM, and also via the 
web (http://wals.info). Identify languages for which inflection is primarily 
expressed with prefixes or primarily with suffixes (Dryer 2008c). Then look 
at word order for these same languages (Dryer 2008a, d). Organize your 
data - including language family and where in the world the language is 
spoken - as shown in Table 7.3. Obviously, you will need a much bigger 
sample of languages than is shown in the table. 


Language                Language family              Where spoken    Affixes            Basic Order 
Paiute (Northern)       Uto-Aztecan, Numic           N. America      Mostly suffixes    SOV 
Evenki                  Altaic, Tungusic             C. Asia         Mostly suffixes    SOV 
Arrernte (Mparntwe)     Australian, Pama-Nyungan     Australia       Mostly suffixes    SOV 
Albanian                Indo-European, Albanian      E. Europe       Mostly suffixes    SVO 
Kikuyu                  Niger-Congo, Benue-Congo     E. Africa       Mostly prefixes    SVO 
Mixtec (Chalcatongo)    Oto-Manguean, Mixtecan       C. America      Mostly prefixes    VSO 


Table 7.3 A sample of languages according to inflectional prefixation vs. suffixation 
and word order 


Give some thought to how you select the languages in your sample. Do 
you want to pick randomly from among those for which information is 
available? Or do you want your sample to be representative according to 
some criterion (e.g. geographically representative, or evenly distributed 
among language families)? Try to anticipate how your choice of methodology 
might affect your results. 

Hint: The interactive tool available as part of the CD-ROM and web 
versions of WALS is the most efficient way to look for correlations, because it 
allows the combination of two features to be mapped (called the ‘Composer’ 
feature on the CD-ROM and ‘Feature Combination’ in the web version). 
The book and web versions contain articles with more detailed information 
about individual features; these articles are useful for understanding how 
the data set was constructed, and for overviews of typological patterns. 

Step 3: Analyze the data. Was your prediction upheld? Remember to 
consider any issues, including the following: 

1) Is there an implicational relationship between the order of words 
and the order of stems and affixes? If so, how strong is it? 

2) Is any implicational relationship of the type that you predicted? For 
example, if you predicted that heads and dependents are in the same order 
at both the sentence level and the word level, is this the pattern found in 
your data? 

3) Do some patterns occur predominantly in one language family, or in 
one area of the world? If so, is this what you would expect, given your 
hypothesis? What explanations could there be for grouping? 

4) This chapter was primarily about hierarchical structure in compounds 
and derivational affixes, but in this study affix-stem order refers to 
inflectional affixes. How important is this discrepancy? Or stated differently, 
do morphologists ever claim that inflectional affixes are heads, and does this 
change your analysis/conclusions at all? (A good answer to this question 
will require you to read more about heads in morphology. See the Further 
reading section.) 

5) Finally, also consider the possible impact of your data collection 
methodology on your results. 
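
One way to approach questions 1) and 2) is to cross-tabulate affix position against basic word order. The sketch below uses only the six illustrative rows of Table 7.3, far too few to support any conclusion; the function names are hypothetical, and a real sample would be read from your own data file.

```python
from collections import Counter

# The six illustrative rows of Table 7.3: (language, affix type, basic order).
SAMPLE = [
    ("Paiute (Northern)", "suffixes", "SOV"),
    ("Evenki", "suffixes", "SOV"),
    ("Arrernte (Mparntwe)", "suffixes", "SOV"),
    ("Albanian", "suffixes", "SVO"),
    ("Kikuyu", "prefixes", "SVO"),
    ("Mixtec (Chalcatongo)", "prefixes", "VSO"),
]


def cross_tabulate(rows):
    """Count languages for each (basic order, affix type) combination."""
    return Counter((order, affix) for _language, affix, order in rows)


def share_suffixing(counts, order):
    """Share of languages with the given order that are mostly suffixing.

    Returns None when the sample has no language with that order.
    """
    suffixing = counts[(order, "suffixes")]
    prefixing = counts[(order, "prefixes")]
    total = suffixing + prefixing
    return suffixing / total if total else None


if __name__ == "__main__":
    counts = cross_tabulate(SAMPLE)
    for order in ("SOV", "SVO", "VSO"):
        print(order, counts[(order, "suffixes")], counts[(order, "prefixes")],
              share_suffixing(counts, order))
```

Adding language family and region as further fields in each row would allow the same counting to be repeated within families or areas, which bears on question 3).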

Step 4: Draw conclusions. Consider whether your data supports only one 
answer to the research question, or whether multiple answers are consistent 
with your data. 

Extension: Explore other kinds of word orders (e.g. the order of 
adpositions and noun phrases, the order of adjectives and nouns, and so 
on), and possible correlations between these and the order of affixes and 
stems. 


Inflectional 
paradigms 


8.1 Syntagmatic and paradigmatic relations in 
morphology 


The relations between linguistic units are of two broad kinds: syntagmatic 
relations between units that (potentially) follow each other in speech, and 
paradigmatic relations between units that (potentially) occur in the same 
slot. In other words, syntagmatic relations have to do with items ordered 
one after the other, while paradigmatic relations have to do with items 
that stand in contrast to one another. We can think about syntagmatic and 
paradigmatic relations at the sentence level, for instance, in (8.1), where 
the horizontal dimension shows syntagmatically related units, and the 
vertical dimension shows paradigmatically related units. Parentheses show 
optionally occurring linguistic units, curly brackets show choices among 
units, and asterisks show impossible units. 


(8.1) In  { the }  beginning  { God   }  { created }  the  { heaven  }  (and the earth) (*not). 
          { *Ø  }             { Allah }  { made    }       { heavens } 
                              { he    }  { *create } 
                              { *why  }  { *rested } 


Morphology can likewise be looked at from both a syntagmatic and a 
paradigmatic point of view. Bases are syntagmatically related to affixes 
that attach to them, whereas word-forms belonging to the same lexeme 
are paradigmatically related because they form a set of contrasting 
instantiations of the lexeme (to take a simple example, the English word bag 
is identifiable as having singular number exactly because it contrasts with 
the plural form bags, and because singular and plural forms generally form 
such a contrast in English). 

(8.2) bag  { -Ø   } 
           { -s   } 
           { *-en }  (as in children) 


Now, in developing a description of language architecture, we might 
ask whether we need formal mechanisms that encode both syntagmatic 
and paradigmatic dimensions of structure. To be sure, some linguists have 
posited models that describe morphological patterns in purely syntagmatic 
terms. The formalism in (8.3) (which we have used in this book to represent 
the morpheme-based model; see (3.22)), represents bags as the linear 
combination of the morphemes bag and -s. 


(8.3) bag           -s 
      /bæg/         /z/ 
      N         +   N __        →  bags 
      ‘bag’         ‘plural’ 


Crucially, (8.3) does not include the paradigmatic perspective at all, which 
is to say that there are no direct relations between word-forms belonging to 
the same lexeme. (Bag is the root, not a word-form with singular number. 
Formally, it is purely coincidental that the root has the same form as the 
singular.) 

The same pattern may also be described with emphasis on the 
paradigmatic dimension. Example (8.4), our standard formalism for the 
word-based model, also represents the syntagmatic aspect of the structure 
(the order in which the stem and plural marker appear in the word), but 
additionally it draws a direct relationship between singular and plural 
forms. 

(8.4)¹  [ /X/N         ] 
        [ ‘x’          ] 
        [ NUMBER: SING ] 
               ↕ 
        [ /Xz/N        ] 
        [ ‘x’          ] 
        [ NUMBER: PLUR ] 


So it is clear that morphological rules need to capture the syntagmatic 
dimension since affixes occur in particular positions relative to their bases, 
but the real question is whether we need to also incorporate paradigmatic 
rules into our formal description of language structure. Logically, if our 
formal description could make do with only syntagmatic description and 
still be empirically adequate, then this should be simpler, and therefore 
preferable. 


1 This correspondence is set up vertically to emphasize that this is a paradigmatic relation 
and to be consistent in this respect with (8.1)-(8.2). But of course, the spatial layout is just a 
convenience. We will return to the normal, space-saving horizontal representation below. 

The answer depends on whether there are phenomena in language that 
can be adequately described only as a direct relationship between inflected 
forms. If so, then morphological structure would seem to include both 
syntagmatic and paradigmatic dimensions. If not, then a purely syntagmatic 
model would be sufficient. In this chapter, we show that some inflectional 
patterns do indeed seem to require a paradigmatic approach, indicating 
that paradigmatic relations are part of the architecture of the (inflectional) 
morphological system. But to make this discussion possible, we must first 
introduce the idea of the inflection class. 


8.2 Inflection classes 


Perhaps the most important challenge for an insightful description of 
inflection is the widespread existence of allomorphy in many languages. 
Phonological and morphophonological allomorphy will be the topic of 
Chapter 10; in this section we focus on suppletive allomorphy. We saw 
some examples of suppletive inflectional affixes in Section 2.3, and two 
more are given in (8.5)-(8.6). 


(8.5) Irish 
      NOM.SG    GEN.PL 
      focal     focail       ‘word’ 
      muc       muic-e       ‘pig’ 
      corón     corón-ach    ‘crown’ 


(8.6) Old English 

      INFINITIVE            3RD SG PRESENT        3RD SG PAST 
      dem-an ‘to deem’      dem-ð ‘deemeth’       dem-de ‘deemed’ 
      luf-ian ‘to love’     luf-að ‘loveth’       luf-ode ‘loved’ 


In (8.5), for instance, all three Irish words show zero expression in the 
nominative singular, but in the genitive plural each word has a different 
inflectional marker: zero, -e or -ach. The genitive plural thus exhibits 
suppletion.² 


2 Remember from Section 2.3 that some linguists use the term suppletion only to refer to 
non-phonological allomorphy in stems. Under such a definition, we would say that 
the words in (8.5) and (8.6) exhibit different affixal morphemes (rather than different 
allomorphs). This is primarily a terminological issue, however, and it has no real impact 
on how inflection classes are defined. 

When different lexemes show different suppletive inflectional 
allomorphs, morphologists say that they belong to different inflection 
classes.³ An inflection class is thus the set of paradigms that exhibit the 
same inflectional pattern. Inflection classes may be very large and contain 
hundreds or thousands of lexemes, or they may be small and contain only a 
handful of lexemes. The limiting case would be an inflection class with just 
a single lexeme (for most purposes, this would amount to saying that the 
inflection of that lexeme is irregular). 

Languages differ in the number of inflection classes that they exhibit. 
In (8.5), three different nominal inflection classes (or declensions) are 
illustrated, and, in (8.6), two verbal inflection classes (or conjugations) are 
shown. The existence of different inflection classes is a hallmark of Indo- 
European languages, so many examples in this chapter will come from the 
Indo-European language family. Of course, the phenomenon is not restricted 
to Indo-European, but there are many languages with fairly complex 
morphological systems in which suppletion of this kind is not found or is 
at least much less prominent (for instance, Turkish, Korean, Quechua and 
Tamil). Thus, not all languages have multiple inflection classes. 

In inflection classes, the various suppletive allomorphs are grouped into 
sets. This can be seen by looking at the complete paradigms of the two Latin 
words: 


(8.7)           o-declension    u-declension 

      SG  NOM   hort-us         grad-us 
          ACC   hort-um         grad-um 
          GEN   hort-ī          grad-ūs 
          DAT   hort-ō          grad-uī 
          ABL   hort-ō          grad-ū 

      PL  NOM   hort-ī          grad-ūs 
          ACC   hort-ōs         grad-ūs 
          GEN   hort-ōrum       grad-uum 
          DAT   hort-īs         grad-ibus 
          ABL   hort-īs         grad-ibus 


We can say that a Latin noun in -us (like hortus ‘garden’, gradus ‘step’) has 
a genitive plural in -ōrum if its genitive singular is -ī, and a genitive plural 
in -uum if its genitive singular is -ūs. If the distribution were arbitrary, we 
might expect that some nouns in Latin would have the genitive singular 
-ī, the ablative singular -ū, the accusative plural -ōs, and the dative plural 
-ibus, for instance. But, in fact, a noun can only choose a complete package 
of suffixes, either the package of hortus (generally called the o-declension) 
or the package of gradus (generally called the u-declension). Thus, one 
form can be used to predict another. Of course, in practice some of these 
dependencies may be more useful than others. For example, the nominative 
and accusative singular are identical in both classes and therefore have little 
or no predictive value. Less obviously, learners of Latin probably hear the 
genitive singular of a new word more often than its genitive plural, so the 
ability to predict the genitive plural from the genitive singular is probably 
more relevant than the ability to make the reverse prediction. 


3 The term inflection class is not generally used for phonological allomorphy. 
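
The idea that suffixes come in complete packages, so that one form predicts another, can be made concrete with a small lookup. This is only an illustrative sketch covering the two declensions in (8.7), with long vowels written with macrons; the names `CLASSES`, `compatible_classes` and `predict` are invented for the example.

```python
# Suffix packages from (8.7); each cell is keyed by (number, case).
CLASSES = {
    "o-declension": {
        ("SG", "NOM"): "us", ("SG", "ACC"): "um", ("SG", "GEN"): "ī",
        ("SG", "DAT"): "ō", ("SG", "ABL"): "ō",
        ("PL", "NOM"): "ī", ("PL", "ACC"): "ōs", ("PL", "GEN"): "ōrum",
        ("PL", "DAT"): "īs", ("PL", "ABL"): "īs",
    },
    "u-declension": {
        ("SG", "NOM"): "us", ("SG", "ACC"): "um", ("SG", "GEN"): "ūs",
        ("SG", "DAT"): "uī", ("SG", "ABL"): "ū",
        ("PL", "NOM"): "ūs", ("PL", "ACC"): "ūs", ("PL", "GEN"): "uum",
        ("PL", "DAT"): "ibus", ("PL", "ABL"): "ibus",
    },
}


def compatible_classes(cell, suffix):
    """Names of the classes whose paradigm has this suffix in this cell."""
    return [name for name, paradigm in CLASSES.items()
            if paradigm[cell] == suffix]


def predict(known_cell, known_suffix, target_cell):
    """Set of suffixes possible in target_cell, given one known form."""
    return {CLASSES[name][target_cell]
            for name in compatible_classes(known_cell, known_suffix)}


if __name__ == "__main__":
    # The genitive singular identifies the class, so the prediction is unique.
    print(predict(("SG", "GEN"), "ī", ("PL", "GEN")))
    # The nominative singular is shared, so both outcomes remain possible.
    print(predict(("SG", "NOM"), "us", ("PL", "GEN")))
```

A cell has predictive value to the extent that it narrows the set returned by `predict`; the nominative singular in -us leaves both genitive plurals open, which is the sense in which it has little or no predictive value here.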


8.2.1 Inflection class assignment 


Words are assigned to inflection classes based on phonological, semantic 
or morphological criteria. Some examples of phonological class assignment 
are given in (8.8). 


(8.8) a. Lezgian aorist participle: -j(i) after low vowel (a, e), -r after high 
vowel (u, ü, i) 


AORIST FINITE        awu-na        t'ü-na         fe-na          ata-na 
AORIST PARTICIPLE    awu-r         t'ü-r          fe-ji          ata-j 
                     ‘did/done’    ‘ate/eaten’    ‘went/gone’    ‘came/come’ 


(Haspelmath 1993: 131) 


b. Eastern Armenian plural: -er with monosyllabic bases, -ner with 
polysyllabic bases 

SG    jeṙk         yug         erexa           tari 
PL    jeṙk-er      yug-er      erexa-ner       tari-ner 
      ‘hand(s)’    ‘oil(s)’    ‘child(ren)’    ‘year(s)’ 


c. Standard Arabic plural: If the singular has the phonological 
shape CVCCVC, then the plural has the form CaCaaCiC. If the 
singular is CVCCVVC, the plural is CaCaaCiiC. 

SG    qaysar       daftar        dirham       dustuur      quftaan 
PL    qayaasir     dafaatir      daraahim     dasaatiir    qafaatiin 
      ‘emperor’    ‘notebook’    ‘drachma’    ‘statute’    ‘caftan’ 
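
Phonological class assignment of the kind in (8.8a) and (8.8b) amounts to a rule that inspects the shape of the base. The following minimal sketch makes two simplifying assumptions that are not part of the description above: syllables are counted by counting vowel letters in the transliteration, and the Lezgian function returns only the suffix label, since the choice between -j and -ji is not stated here.

```python
# Vowel letters of the transliterations used in (8.8); an assumption made
# for this sketch, not a full inventory of either language.
VOWELS = set("aeiouü")


def armenian_plural(base):
    """Eastern Armenian plural: -er after monosyllabic bases, else -ner."""
    syllables = sum(1 for ch in base if ch in VOWELS)
    suffix = "er" if syllables == 1 else "ner"
    return f"{base}-{suffix}"


def lezgian_participle_suffix(stem):
    """Lezgian aorist participle suffix, chosen by the stem-final vowel."""
    final = stem[-1]
    if final in "ae":        # low vowels
        return "-j(i)"
    if final in "uüi":       # high vowels
        return "-r"
    raise ValueError(f"stem does not end in a vowel: {stem!r}")


if __name__ == "__main__":
    for base in ("yug", "erexa", "tari"):
        print(armenian_plural(base))
    for stem in ("awu", "t'ü", "fe", "ata"):
        print(stem, lezgian_participle_suffix(stem))
```

No lexeme needs to be marked for its class in such a system: the shape of the base alone selects the allomorph.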


Among semantic criteria, animacy distinctions are particularly 
widespread. In German, only animate nouns belong to the masculine 
n-declension ending in -e in the nominative singular (Hase ‘hare’, Affe ‘ape’, 
Junge ‘boy’). In Tamil, the locative suffix is -il with non-human nouns (e.g. 
natt-il ‘in the country’), but -itam with human nouns (e.g. manitan-itam ‘in the 
man’) (Annamalai and Steever 1998: 105). Welsh has a special plural suffix 
for nouns denoting animals, -od (e.g. cath/cathod 'cat(s)', draenog/draenogod 
‘hedgehog(s)’, eliffant/eliffantod ‘elephant(s)’) (King 1993: 59). Lezgian has 
a special oblique-stem marker that is used with all consonant-final proper 
names, -a (e.g. Farid-a ‘Farid’, Talibov-a ‘Talibov’). Lezgian also illustrates the 
potential relevance of the mass-count distinction: mass nouns tend to have 
the oblique-stem suffix -adi/-edi (e.g. naq’w-adi ‘soil’, kf-adi ‘foam’, hüm-edi 
‘haze’) (Haspelmath 1993: 75-6). In verbs, transitivity often plays a role. For 
example, in Ossetic intransitive and transitive verbs show different agreement 
inflection in the past tense. The singular forms of the intransitive verb xuyssy- 
‘sleep’ and of the transitive verb dzur-/dzyr- ‘say’ are given in (8.9). 


(8.9)        intransitive pattern       transitive pattern 
      1SG    xuyssy-d-æn ‘I slept’      dzyr-d-on ‘I said’ 
      2SG    xuyssy-d-æ                 dzyr-d-aj 
      3SG    xuyssy-d                   dzyr-d-a 


(Isaev 1966: 247) 


When there is morphological assignment, most typical is that the 
derivational pattern of a derived lexeme determines its inflectional 
behaviour. For example, Welsh has about a dozen different plural patterns, 
which are often unpredictably associated with individual nouns. However, 
when a noun has a derivational suffix, it is mostly predictable which plural 
affix the noun takes: 


(8.10) Derivational    Plural    Base                 Derived form 
       suffix          suffix 

       -og (person)    -ion      swydd ‘job’          swydd-og(-ion) ‘official(s)’ 
                                 march ‘horse’        march-og(-ion) ‘horseman/-men’ 

       -es (female)    -au       tywysog ‘prince’     tywysog-es(-au) ‘princess(es)’ 
                                 Sais ‘Englishman’    Saesn-es(-au) ‘Englishwoman/-women’ 

       -ur (agent)     -iaid     pechu ‘sin’          pechad-ur(-iaid) ‘sinner(s)’ 
                                 cachu ‘shit’         cachad-ur(-iaid) ‘coward(s)’ 


(King 1993: 53-61) 


Another example of derivational patterns determining inflectional 
behaviour comes from Tagalog. In this language most verbs have a 
derivational affix that indicates in some way the transitivity or voice of the 
verb (e.g. actor voice -um-, ma-, patient voice -in, -an). The perfective form 
of the verb can be formed in four different ways: (i) zero (when the voice 
affix is -um-); (ii) m- becomes n- (e.g. when the voice affix is ma-); (iii) infix 
-in- (e.g. when the voice affix is -an); and (iv) infix -in- and subtraction of -in 
(when the voice affix is -in): 


(8.11) root      basic form with voice affix    perfective form    gloss 

       takbo     tumakbo                        tumakbo            ‘run’ 
       tulog     matulog                        natulog            ‘sleep’ 
       hugas     hugasan                        hinugasan          ‘wash’ 
       basah     basahin                        binasah            ‘read’ 


As we saw in Section 7.3, this is one of the reasons why some morphologists regard 
derivational affixes as heads of their lexemes — the derivational affix determines the 
inflectional pattern of the entire word-form. 

Again, this illustrates the dependence of inflection-class membership 
on a morphological property of the lexeme (its derivational pattern). Of 
course, the derivational pattern need not be characterized by an affix. In 
Arabic, nouns derived by the pattern C1aaC2iC3 tend to have the plural 
C1uC2C2aaC3 (e.g. kaafir ‘infidel’, plural kuffaar; kaatib ‘writer’, plural 
kuttaab; zaahid ‘ascetic’, plural zuhhaad). 
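
The four Tagalog perfective patterns in (8.11) can be restated as a small procedure. The following Python sketch is only an illustration: the function name and the string labels for the voice affixes are our own assumptions, and the infixation step is simplified to what the four examples require (the infix -in- is placed after a single initial consonant).

```python
def tagalog_perfective(basic_form: str, voice_affix: str) -> str:
    """Derive the perfective from the basic form, given the voice affix.

    Covers only the four patterns listed for (8.11); the affix labels
    "-um-", "ma-", "-an" and "-in" are illustrative names.
    """

    def infix_in(form: str) -> str:
        # Simplification: insert -in- after the first segment, which is a
        # single consonant in all the examples considered here.
        return form[0] + "in" + form[1:]

    if voice_affix == "-um-":
        return basic_form                 # (i) zero marking
    if voice_affix == "ma-":
        return "n" + basic_form[1:]       # (ii) m- becomes n-
    if voice_affix == "-an":
        return infix_in(basic_form)       # (iii) infix -in-
    if voice_affix == "-in":
        return infix_in(basic_form[:-2])  # (iv) infix -in-, subtract -in
    raise ValueError(f"no perfective pattern known for {voice_affix!r}")
```

The point of the sketch is that the function never needs to look at the root itself: the voice affix alone selects the inflectional pattern.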


8.2.2 Relationship to gender 


Inflection classes are often linked to gender, but this relationship can be 
complicated. For instance, in the Bantu languages (e.g. Zulu), a link between 
inflection class and gender is evident: the markers that reflect the gender on 
agreement targets are formally similar to the inflectional affixes on controller 
nouns. Zulu has the four inflection classes illustrated in (8.12), among others. 


(8.12) SG PREFIX   PL PREFIX   EXAMPLE                 MEANING        AGR PREFIXES
       um-         aba-        umfazi/abafazi          ‘woman/-men’   u-/ba-
       um-         imi-        umfula/imifula          ‘river(s)’     u-/i-
       i-          ama-        itafula/amatafula       ‘table(s)’     li-/a-
       isi-        izi-        isicathulo/izicathulo   ‘shoe(s)’      si-/zi-


The agreement prefixes for the genders corresponding to the four inflection 
classes are given in the last column in (8.12). Two examples of their use as 
subject prefixes on verbs are given in (8.13). 


(8.13) a. Aba-fazi      ba-biza           aba-fana
          PL.G2-woman   3PL.G2.SBJ-call   PL.G2-boy
          ‘The women call the boys.’

       b. Isi-hambi         si-buza      um-gwaqo.
          SG.G7-traveller   3SG.G7-ask   SG.G3-road
          ‘The traveller asks the road.’
(Ziervogel et al. 1981: 34, 46) 


The similarities between aba- and isi- as inflectional markers on the nouns, 
and ba- and si- as gender agreement markers on the verbs, are striking. 
Clearly, there is a close correspondence between the gender classes and 
inflection classes in Zulu. 

Nonetheless, we need to make a principled distinction between inflection 
class and gender. Consider the Italian examples (8.14)-(8.15). 


(8.14) Two Italian inflection classes

       SG SUFFIX   PL SUFFIX   EXAMPLE             MEANING       AGR SUFFIXES
       -o          -i          giardino/giardini   ‘garden(s)’   -o/-i (masc.)
       -a          -e          casa/case           ‘house’       -a/-e (fem.)
       -o          -i          mano/mani           ‘hand’        -a/-e (fem.)
       -a          -i          poeta/poeti         ‘poet’        -o/-i (masc.)
(8.15) Italian gender agreement (adjectives agree with nouns)
       a. il giardin-o nuov-o   ‘the new garden’
       b. la cas-a nuov-a       ‘the new house’
       c. la man-o rugos-a      ‘the wrinkled hand’
       d. il poet-a mort-o      ‘the dead poet’


In Italian we clearly need to distinguish between inflection classes and 
genders because there are nouns that have the singular suffix -o but are 
feminine (e.g. mano ‘hand’) and nouns that have the singular suffix -a but are 
masculine (e.g. poeta ‘poet’). Such nouns are much rarer than the opposite 
pattern, but they exist. Thus, we must make a principled distinction between 
a noun’s inflection class (which determines the set of inflected forms of the 
noun lexeme), and its gender (an agreement feature). 

Formally, semantic factors and the inflection class of a noun together 
determine the noun’s gender (Corbett 1982, 1991). For instance, most 
Russian nouns belonging to inflection class II (indicated by a nominative 
singular form ending in -a, e.g. kniga ‘book’) are feminine, but a small 
subset are masculine (sluga ‘male servant’). These latter nouns take 
the same endings as class II feminine nouns, but we know that they are 
masculine because they trigger masculine syntactic agreement (xorošaja 
kniga ‘good(fem.) book’; xorošij sluga ‘good(masc.) servant’). If gender 
determined inflection class, we would expect these words to fall into class I, 
which contains the vast majority of masculine nouns. However, if inflection 
class and semantic factors together determine gender, the observed pattern 
is easier to explain. Importantly, all of the nouns like sluga refer to sex- 
differentiable male beings. We must therefore simply assume that gender 
assignment in Russian is determined by natural sex when the word refers 
to a sex-differentiable animal or human, and by inflection class otherwise. 
(There are also languages in which gender is purely semantically based, or 
in which it depends on a combination of semantic and phonological criteria 
(Corbett 1991), but these are not directly relevant here.) 
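
The assignment logic just described (natural sex takes priority, inflection class decides otherwise) can be sketched as a two-step lookup. This is a minimal illustration, not a full account of Russian gender: the function name and the class-to-gender table are our own assumptions, and only the classes mentioned in this chapter (I, II and IV) are included.

```python
from typing import Optional

# Default gender by inflection class, for nouns without sex-differentiable
# referents. Only classes discussed in this chapter are listed (assumption).
DEFAULT_GENDER_BY_CLASS = {"I": "masculine", "II": "feminine", "IV": "neuter"}


def russian_gender(inflection_class: str, natural_sex: Optional[str] = None) -> str:
    """Assign gender: by natural sex if available, else by inflection class."""
    if natural_sex is not None:
        # Semantic criterion overrides the morphological one.
        return {"male": "masculine", "female": "feminine"}[natural_sex]
    return DEFAULT_GENDER_BY_CLASS[inflection_class]
```

On this sketch, kniga ‘book’ (class II, no natural sex) comes out feminine, while sluga ‘male servant’ (class II, male referent) comes out masculine without having to change its inflection class.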


8.2.3 Inflection classes and productivity 


Like word-formation patterns, inflection classes can differ in productivity. 
Specifically, they can differ in their ability to apply to novel lexemes that 
come into the language, either as loanwords or as neologisms formed by 
productive word-formation rules, and they can also differ in their ability to 
attract new members by inflection class shift. A class shift is a diachronic 
change by which a lexeme changes its inflection class. (We have already 
seen an example of inflection class shift in Section 6.4.3: fling/flinged > 
fling/flung.) It is convenient to identify three degrees of inflection-class 
productivity on the basis of these criteria (Dressler 1997), as summarized 
in Table 8.1. 
Criteria and examples      Highly productive   Classes with     Unproductive
                           classes             intermediate     classes
                                               productivity
Criteria
  Apply to loanwords       YES                 NO               NO
  Attract class-shifting   YES                 NO               NO
    lexemes
  Can form neologisms      YES                 YES              NO
Examples
  Welsh plurals            -au, -iaid          -oedd            -edd
  Italian nouns            -o/-i               -a/-i, -e/-i     (none)
  Russian nouns            -C (I), -a (II)     -o (IV)          -ja


Table 8.1 Three degrees of inflection-class productivity 


Only highly productive classes are able to accommodate loanwords and 
attract lexemes from other, unstable classes. Productively formed neologisms, 
by contrast, often go into classes with intermediate productivity. Completely 
unproductive classes do not get new members at all, and, since they inevitably 
lose some members (e.g. when a word becomes obsolete), they are ultimately 
doomed to disintegration unless the productivity of the class changes. 

For exemplification, let us go back to Welsh plurals (King 1993: 52-64). As 
shown in (8.10), Welsh has several highly productive plural classes that can 
accommodate loanwords from English — for instance, the suffix -au, which 
is the most common Welsh plural suffix (e.g. siop/siopau ‘shop(s)’, trên/
trenau ‘train(s)’), or -iaid, which is often used with nouns denoting persons 
(e.g. doctor/doctoriaid ‘doctor(s)’, biwrocrat/biwrocratiaid ‘bureaucrat(s)’). 
Both these classes also apply to regularly formed neologisms. Thus, 
-au is always used with quality nouns in -deb (e.g. ffurfioldeb ‘formality’, 
ffurfioldebau ‘formalities’), and -iaid is always used with agent nouns in -dur 
(e.g. pechadur ‘sinner’, pechaduriaid ‘sinners’). The class in -au also shows 
its productivity in attracting members of other classes — e.g. from the 
class of plurals in -oedd. For example, amser 'time' has an older plural 
amseroedd and a newer plural amserau, and cylch ‘circle’ has an older 
plural cylchoedd and a newer plural cylchau. The plural class in -oedd is 
thus losing members, but it has at least intermediate productivity in that 
productively formed place-nouns in -fa have -oedd plurals (e.g. meithrinfa 
‘nursery’, meithrinfaoedd ‘nurseries’). Completely unproductive is, for 
instance, the plural suffix -edd of bys/bysedd ‘finger(s)’, as well as the various 
classes of vowel-changing plurals (e.g. ffordd/ffyrdd ‘road(s)’, asgell/esgyll ‘wing(s)’). 

In Italian nouns, the -o/-i and -a/-e inflection classes (see (8.14)) are highly 
productive: they are used with loanwords (e.g. il chimono, plural i chimoni 
‘kimono(s)’, la giungla, plural le giungle ‘jungle(s)’), and occasionally they 
attract members from other inflection classes in non-standard varieties 
of Italian (e.g. standard language il pane ‘bread’ becomes il pano ‘bread’, 
la moglie becomes la moglia ‘wife’). The class in -a/-i (poeta/poeti ‘poet(s)’) 
cannot be used with loanwords: the noun lama ‘Tibetan monk’ does not get 
the plural -i (*i lami ‘lamas’) but remains unchanged in the plural (i.e. it joins 
the class of indeclinables, like all consonant-final loanwords). However, 
the -a/-i class is not totally unproductive, as it is used with the productive 
suffix -ista (e.g. leghista ‘follower of the Lega’, plural leghisti). There is no real 
unproductive class in Italian, unless one regards the few irregular nouns 
(uomo/uomini ‘man/men’, bue/buoi ‘ox(en)’, etc.) as classes of their own. 

In Russian, two highly productive classes have been absorbing many 
lexemes from other inflection classes over the past millennium. Loanwords 
become class I if they end in a ‘hard’ consonant (komp'juter), or class II if 
they end in -a (disketta). By contrast, class IV (consisting almost entirely 
of neuter nouns ending in -o or -e) is not highly productive: even loanwords 
ending in -o (such as pal'to ‘coat’ from French paletot) do not follow this class 
but are indeclinable. However, the class still gets new members through 
productive suffixes like -stvo, which creates abstract nouns (e.g. professorstvo 
‘professorship’). There is a small class of neuters in -ja (e.g. vremja ‘time’) 
that is totally unproductive. 


8.3 Paradigmatic relations and inflection class shift 


In the rest of this chapter we explore why it can be useful, and in some cases 
necessary, to look at inflection classes from a paradigmatic perspective. 
We begin by using the word-based model from Section 3.2.2 to develop a 
formal description. 

In a word-based description, the relation between the inflected forms of 
a lexeme can be seen as parallel to the relation between two derivationally 
related lexemes. Thus, the relation between hortī ‘garden, NOM.PL’ and 
hortōrum ‘garden, GEN.PL’ can be characterized by the rule in (8.16). The 
full form of the rule is given in (8.16a), and (8.16b) shows an equivalent 
abbreviated notation. 


(8.16) a. [ /Xī/N             ]       [ /Xōrum/N         ]
          [ ‘x’               ]   ↔   [ ‘x’              ]
          [ CASE: NOMINATIVE  ]       [ CASE: GENITIVE   ]
          [ NUMBER: PLURAL    ]       [ NUMBER: PLURAL   ]

       b. [/Xī/NOM.PL] ↔ [/Xōrum/GEN.PL]
The fact that there is no Latin noun with a nominative plural in -ī and a 
genitive plural in -uum is thus expressed by the non-existence of a rule that 
would link these two suffixes. Thus, a correct genitive plural form can be 
created on the basis of other word-forms in the paradigm, and in fact every 
form can be created on the basis of every other form. Since there are ten 
forms in the paradigm, we can posit 45 pairwise rules like (8.16). 
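
The figure of 45 follows from counting unordered pairs of paradigm cells: with ten cells there are 10 × 9 / 2 = 45 pairs. The short Python check below makes the arithmetic explicit; the cell labels are simply the ten case-number combinations of the Latin noun paradigm, and the variable names are our own.

```python
from itertools import combinations
from math import comb

# The ten cells of the Latin noun paradigm (five cases, two numbers).
CELLS = [
    "NOM.SG", "GEN.SG", "DAT.SG", "ACC.SG", "ABL.SG",
    "NOM.PL", "GEN.PL", "DAT.PL", "ACC.PL", "ABL.PL",
]

# Each pairwise rule links two different cells; order does not matter
# because the rules are bidirectional.
pairwise_rules = list(combinations(CELLS, 2))

number_of_rules = len(pairwise_rules)
assert number_of_rules == comb(len(CELLS), 2)
```

The rapid growth of this number with paradigm size is one practical reason for preferring a single rule that relates all the cells at once.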

Now recall that, even in derivational morphology, there is sometimes 
reason for positing rules that involve more than two word-schemas. In 
Section 3.2.2 we saw that, in addition to the English rules in (8.17a,b), we 
also need the rule in (8.18) because there are word families that contain two 
derived forms, but not the root (e.g. illusion, illusive, *illude). 


(8.17) a. [/X/V ‘do x’] ↔ [/Xion/N ‘action of doing x’]
       b. [/X/V ‘do x’] ↔ [/Xive/A ‘prone to doing x’]

(8.18) [/Xion/N ‘action of doing x’] ↔ [/Xive/A ‘prone to doing x’]


These three rules are more properly described as a single rule relating 
three word-schemas. This can be represented using a further notational 
convention: sets of corresponding word-schemas are enclosed in curly 
brackets and separated by commas. Thus, (8.19) is a convenient notation 
for the combination of (8.17) and (8.18). 


(8.19) { [/X/V ‘do x’], [/Xion/N ‘action of doing x’], [/Xive/A ‘prone to doing x’] }


If we adopt this formalism, we can formulate the rule in (8.20), which 
contains 10 corresponding word-schemas, to describe the Latin paradigm.⁵ 


(8.20) { [/Xus/NOM.SG], [/Xī/GEN.SG], [/Xō/DAT.SG], [/Xum/ACC.SG],
         [/Xō/ABL.SG], [/Xī/NOM.PL], [/Xōrum/GEN.PL], [/Xīs/DAT.PL],
         [/Xōs/ACC.PL], [/Xīs/ABL.PL] }
In what follows, we will call rules like (8.20) paradigm rules. Such word- 
based rules capture the generalization that inflectional markers come in 
packages. 

This approach is useful as a way to explain class shifts. For example, in 
later Latin quite a few nouns of the u-declension shifted to the o-declension,
e.g. senātus ‘senate’ (older genitive form senātūs, newer genitive senātī), 
exercitus ‘army’, frūctus ‘fruit’. How can class shift be explained? First, 


5 This word-based rule is just a notational variant of the paradigms found in Latin school 
grammars. Latin school grammars usually give a concrete lexeme like hortus, but everyone 
understands that hortus is just an example and really stands for /Xus/. Thus, the word- 
based description is just a somewhat more explicit variant of what school grammars have 
long been doing. 
we noted earlier that not all word-forms are equally good indicators of 
inflection class membership. The nominative singular and genitive singular 
together can uniquely identify the class that a Latin noun belongs to, but 
the nominative singular by itself cannot. So to explain this shift, we need 
assume only that the innovating speakers did not remember the genitive 
form of these nouns for some reason (perhaps because it is infrequent). 
Now if they remember only the nominative form, the word matches two 
paradigm rules — i.e. it could belong either to the o-declension or to the 
u-declension. In such situations of choice, speakers tend to opt for those rules 
that generalize over more items. Latin always had many more o-declension 
nouns than u-declension nouns, so that the o-declension rule was stronger. 
This explains why shifts from the u-declension to the o-declension were 
common in Latin, but shifts in the opposite direction did not occur (see 
Wurzel 1987: 79). 
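
The reasoning in this explanation can be made concrete with a small Python sketch. It is a deliberately reduced model: only two cells of each paradigm rule are listed, the function names are our own, and the type frequencies are hypothetical numbers chosen only to reflect the statement that the o-declension had many more members than the u-declension.

```python
# Two partial paradigm rules; only the cells discussed in the text are listed.
PARADIGM_RULES = {
    "o-declension": {"NOM.SG": "us", "GEN.SG": "ī"},
    "u-declension": {"NOM.SG": "us", "GEN.SG": "ūs"},
}

# Hypothetical type frequencies (assumption, not attested counts).
TYPE_FREQUENCY = {"o-declension": 1000, "u-declension": 100}


def matching_classes(known_forms):
    """Return the classes whose rule fits every form the speaker remembers."""
    matches = []
    for name, suffixes in PARADIGM_RULES.items():
        stems = set()
        for cell, form in known_forms.items():
            suffix = suffixes[cell]
            if not form.endswith(suffix):
                break
            stems.add(form[: len(form) - len(suffix)])
        else:
            # Reached only if no form failed; all forms must share one stem.
            if len(stems) == 1:
                matches.append(name)
    return matches


def choose_class(known_forms):
    """Among the matching rules, prefer the one covering the most lexemes."""
    return max(matching_classes(known_forms), key=TYPE_FREQUENCY.get)
```

A speaker who remembers both senātus and senātūs has only one matching rule, so no shift occurs. A speaker who remembers only senātus has two matching rules, and the choice falls on the o-declension, which is the direction of shift actually observed.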

A key observation for explaining inflection class shift is thus that the 
form of a word is indicative of its inflection class membership, which is the 
same as saying that speakers use one word-form in a paradigm to predict 
another. This is a paradigmatic relation. A purely syntagmatic approach 
(such as the morpheme-based model) would have trouble describing this 
generalization, but word-based rules are naturally well suited to accounting 
for this data because they incorporate and emphasize the paradigmatic 
dimension of (inflectional) morphological structure. 


8.4 Inheritance hierarchies 


The value of the paradigmatic perspective also emerges when we examine 
similarities across inflection classes. From what we have said so far, one 
might get the impression that inflection classes may differ arbitrarily in the 
kinds of markers that they exhibit. But in fact different inflection classes 
often show great similarities, to the point where it is unclear whether a 
separate inflection class needs to be set up. Let us consider seven important 
inflection classes of Modern Greek nouns, shown in the traditional way in 
(8.21). (To simplify the presentation, stress is ignored here, even though it is 
relevant to establishing inflection classes in Greek.) 


(8.21)          os-declension   as-declension      us-declension
       SG NOM   nomos           pateras            papus
          ACC   nomo            patera             papu
          GEN   nomu            patera             papu
       PL NOM   nomi            pateres            papuðes
          ACC   nomus           pateres            papuðes
          GEN   nomon           pateron            papuðon
                ‘law (masc.)’   ‘father (masc.)’   ‘grandfather (masc.)’

                a-declension    i1-declension         i2-declension   u-declension
       SG NOM   imera           texni                 poli            maimu
          ACC   imera           texni                 poli            maimu
          GEN   imeras          texnis                poleos          maimus
       PL NOM   imeres          texnes                polis           maimuðes
          ACC   imeres          texnes                polis           maimuðes
          GEN   imeron          texnon                poleon          maimuðon
                ‘day (fem.)’    ‘art, skill (fem.)’   ‘town (fem.)’   ‘monkey (fem.)’


In the more abstract notation of our paradigm rules, these could be written 
as (8.22). 


(8.22) a. Paradigm rule for the os-declension
          { [/Xos/NOM.SG], [/Xo/ACC.SG], [/Xu/GEN.SG],
            [/Xi/NOM.PL], [/Xus/ACC.PL], [/Xon/GEN.PL] }

       b. Paradigm rule for the as-declension
          { [/Xas/NOM.SG], [/Xa/ACC.SG], [/Xa/GEN.SG],
            [/Xes/NOM.PL], [/Xes/ACC.PL], [/Xon/GEN.PL] }

       and so on.⁶


None of the seven classes in (8.21) is completely identical to any other class, 
but the similarities among them are evident. Theoretically, given seven 
different declensions and six cells in the paradigm, we could have (6 x 
7 =) 42 totally different suffixes. In reality we have almost the opposite: 
the declensions seem to differ only slightly from each other. One might 
even propose that some of them could be lumped together, especially the 
a-declension and the i1-declension. 
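
The contrast between the theoretical maximum and the actual inventory can be checked by counting. The Python sketch below lists the suffixes of the seven declensions as they can be read off (8.21); the segmentation into stem and suffix (e.g. pap-us, maim-uðes) and the variable names are our own assumptions.

```python
CELLS = ("NOM.SG", "ACC.SG", "GEN.SG", "NOM.PL", "ACC.PL", "GEN.PL")

# Suffixes per declension, in the order of CELLS (segmentation is an assumption).
DECLENSIONS = {
    "os": ("os", "o", "u", "i", "us", "on"),
    "as": ("as", "a", "a", "es", "es", "on"),
    "us": ("us", "u", "u", "uðes", "uðes", "uðon"),
    "a": ("a", "a", "as", "es", "es", "on"),
    "i1": ("i", "i", "is", "es", "es", "on"),
    "i2": ("i", "i", "eos", "is", "is", "eon"),
    "u": ("u", "u", "us", "uðes", "uðes", "uðon"),
}

# Upper bound: every cell of every declension has its own suffix.
theoretical_maximum = len(CELLS) * len(DECLENSIONS)

# Actual inventory: distinct suffix shapes, ignoring which cell they fill.
distinct_suffixes = {s for suffixes in DECLENSIONS.values() for s in suffixes}
```

On this segmentation, the 42 cells are filled by only 14 distinct suffix shapes, which is the kind of overlap that the rule-schemas introduced next are meant to capture.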

In order to express this generalization, we will introduce one additional 
descriptive device: the rule-schema, which generalizes over rules in much 
the same way as word-schemas generalize over words. Thus, given the 
paradigm rules for the a-declension and the i1-declension in (8.23), we can 
formulate the rule-schema in (8.24), which subsumes both rules. In addition 
to the stem variable X, this also contains the variable V for the vowel, which 
may be instantiated by a or i. 


(8.23) a. Paradigm rule for the a-declension
          { [/Xa/NOM.SG], [/Xa/ACC.SG], [/Xas/GEN.SG],
            [/Xes/NOM.PL], [/Xes/ACC.PL], [/Xon/GEN.PL] }

       b. Paradigm rule for the i1-declension
          { [/Xi/NOM.SG], [/Xi/ACC.SG], [/Xis/GEN.SG],
            [/Xes/NOM.PL], [/Xes/ACC.PL], [/Xon/GEN.PL] }


6 There is no point in rewriting all the paradigms of (8.21) in this format, because the tabular 
format is more perspicuous than the format with brackets and subscripts. 
(8.24) Rule schema for (8.23a-b)
       { [/XV/NOM.SG], [/XV/ACC.SG], [/XVs/GEN.SG],
         [/Xes/NOM.PL], [/Xes/ACC.PL], [/Xon/GEN.PL] }


To make the notation more reader-friendly, let us introduce the formalism 
in Figure 8.1, where the slashes for the phonological representation and 
the inflectional values are omitted for the sake of simplicity. In this figure, 
the two declensions and the rule schema are shown in a tree format, the 
standard format for representing taxonomic hierarchies. In effect, the 
a-declension and the i1-declension are subtypes of the declension described 
by the rule-schema of (8.24), in much the same way as, say, a violin and a 
cello are subtypes of stringed instruments, and these are again a subtype 
of musical instrument. The taxonomic hierarchy of declension classes is 
completely parallel to hierarchies of this familiar kind, and is called an 
inheritance hierarchy. 


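[Figure 8.1 (tree diagram): the top node is the rule schema of (8.24), with singular forms XV, XV, XVs and plural forms Xes, Xes, Xon; its two daughter nodes are the a-declension and the i1-declension.]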


Figure 8.1 A hierarchy of declension classes 


In this formalism, those pieces of information that are identical in the rule 
schema and in the individual rule need not be specified twice. They can be 
specified once in the rule-schema, and the individual paradigm rule can 
inherit the information from the superordinate node in the hierarchy (hence 
the name inheritance hierarchy). This is symbolized by the use of boldface 
and normal print in Figure 8.1: boldface information is necessary, and 
normal-print information is redundant and could in principle be inherited 
from the superordinate node. (If we wanted a completely redundancy-free 
representation of grammatical information, normal-print material could 
simply be omitted. However, as we saw earlier in the discussion of word 
storage (Section 4.3), lack of redundancy does not seem to be a priority for 
human memory.) 
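
The idea of inheritance can be illustrated with a small Python sketch in which the rule schema of (8.24) is stored once and each daughter declension stores only what it does not inherit, namely the value of the variable V. The data layout and the function name are our own assumptions; the sketch corresponds to the redundancy-free variant mentioned above, not to a claim about how speakers store the information.

```python
# Rule schema (8.24): "V" stands for the vowel variable.
SCHEMA = {
    "NOM.SG": "V", "ACC.SG": "V", "GEN.SG": "Vs",
    "NOM.PL": "es", "ACC.PL": "es", "GEN.PL": "on",
}

# Daughter nodes specify only the non-inherited information.
DAUGHTERS = {
    "a-declension": {"V": "a"},
    "i1-declension": {"V": "i"},
}


def paradigm(stem: str, declension: str) -> dict:
    """Build all six forms by inheriting from the schema and filling in V."""
    vowel = DAUGHTERS[declension]["V"]
    return {
        cell: stem + suffix.replace("V", vowel)
        for cell, suffix in SCHEMA.items()
    }
```

For example, `paradigm("imer", "a-declension")` yields the forms of imera ‘day’ in (8.21), and `paradigm("texn", "i1-declension")` those of texni ‘art, skill’, although the plural suffixes are stated only once, in the schema.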

We can also extend this hierarchy to subsume the other Modern Greek 
inflection classes that we saw earlier. Figure 8.2 shows an attempt to draw 
a single inheritance hierarchy for the seven classes of (8.21) that has four 
different levels of abstractness. 
[Figure 8.2 (tree diagram): the top-level rule schema has singular forms XVZ, XV, XVZ and plural forms XVZ, XVs, Xon; below it are intermediate rule schemas and, at the lowest level, the seven declension classes of (8.21).]


Figure 8.2 An inheritance hierarchy for seven Modern Greek declension classes 


The top-level rule-schema is so abstract that it consists almost exclusively 
of variables (X for the stem, V for a vowel following the stem and Z for 
anything else, including zero, that follows that vowel). The only concrete 
elements that all classes share are the genitive plural suffix -on and the last 
consonant of the accusative plural suffix (-s). The major split is between the 
masculine classes (-os, -as, -us), on the one hand, and the feminine classes 
(-a, -i1, -i2, -u), on the other: all masculines are characterized by an -s in 
the nominative singular, and all feminines are characterized by an -s in the 
genitive singular. 

The inheritance network allows us a flexible and sophisticated answer 
to the question of how many different inflection classes should be set up 
for the Modern Greek data in (8.21). At the lowest level, there are seven 
classes, and we may call these microclasses. At an intermediate level, we 
might say that there are four classes (some of them with subclasses), and at 
a higher level, we could say that it has just two macroclasses, the masculine 
and feminine declension types. 

Now, the hierarchy in Figure 8.1 is just a single tree with no cross- 
classification, but in reality such cross-classifications are necessary, and 
examples are easy to find. This is again parallel to other domains of 
knowledge. One could cross-classify musical instruments into classical 
instruments (violin, cello, flute) and modern instruments (saxophone, 
electric guitar). One obvious generalization that is missed by Figure 8.2 but 
that is certainly not lost on speakers of Greek is the similarity between the 
us-class and the u-class, so for this, cross-classification is required. 
In the hierarchy of Figure 8.2 there is also never a conflict between a lower 
and a higher node; higher nodes are merely less specific. Now it has been 
suggested that such conflicts should be allowed, and that specifications in 
a lower node should be able to override specifications in a higher node. For 
example, the Greek os-declension and the a-declension could be subsumed 
under the same rule schema as shown in Figure 8.3. 


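[Figure 8.3 (tree diagram): the os-declension node has singular forms Xos, Xo, Xu and plural forms (!)Xi, (!)Xus, Xon, where (!) marks a specification that overrides the higher node.]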


Figure 8.3 An inheritance hierarchy with a mismatch 


Here there is a mismatch between the nominative and accusative plural 
forms /Xi/ and /Xus/ and the corresponding forms specified in the higher 
node (/Xes/). The exclamation mark in the notation shows that a higher 
specification is overridden. The forms /Xes/ in the higher rule schema are 
no longer fully schematic, but they are a default that applies unless it is 
overridden. By using the device of default rules and overrides, the inheritance 
hierarchy can be simplified. Thus, in Figure 8.2 one of the rule schemas 
could be dispensed with if the description of Figure 8.3 were adopted. 
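
Defaults and overrides of this kind have a direct counterpart in a layered lookup, where a lower node is consulted first and the higher node supplies whatever the lower node leaves unspecified. The Python sketch below uses the standard-library `ChainMap` for this. It is a simplification under stated assumptions: only the plural cells of the higher schema are modelled, and the dictionary and function names are our own.

```python
from collections import ChainMap

# Higher node: default plural suffixes (singular cells are left out here).
HIGHER_SCHEMA = {"NOM.PL": "es", "ACC.PL": "es", "GEN.PL": "on"}

# Lower nodes: the os-declension overrides two of the defaults ...
OS_DECLENSION = {
    "NOM.SG": "os", "ACC.SG": "o", "GEN.SG": "u",
    "NOM.PL": "i",   # (!) overrides default -es
    "ACC.PL": "us",  # (!) overrides default -es
}
# ... whereas the as-declension inherits all plural suffixes unchanged.
AS_DECLENSION = {"NOM.SG": "as", "ACC.SG": "a", "GEN.SG": "a"}

# Lookups try the lower node first and fall back on the higher node.
os_rule = ChainMap(OS_DECLENSION, HIGHER_SCHEMA)
as_rule = ChainMap(AS_DECLENSION, HIGHER_SCHEMA)


def inflect(stem: str, rule, cell: str) -> str:
    """Combine a stem with the suffix that the rule supplies for the cell."""
    return stem + rule[cell]
```

Here `inflect("nom", os_rule, "NOM.PL")` gives nomi through the override, while `inflect("nom", os_rule, "GEN.PL")` gives nomon through the inherited default.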

So in the end, the tree structure and inheritance are not by themselves 
always sufficient to capture similarities among inflection classes, but they 
do help us to see that inflection classes can be related to each other to greater 
or lesser degrees. 

By setting up hierarchies as in Figures 8.2 and 8.3, our formal description 
establishes relationships between entire inflection classes along a 
paradigmatic dimension. Moreover, we know that this dimension is salient 
for Greek speakers (and not just the creation of linguists) because of inflection 
class changes over time. For instance, the Modern Greek i2-declension used 
to have the ending -is in the nominative singular ({ [/Xis/NOM.SG], [/Xi/ACC.SG], 
[/Xeos/GEN.SG], ... }), e.g. polis/poli/poleos ‘town’. The change from 
/Xis/ to /Xi/ in the nominative singular was clearly a morphological, not 
a phonological change. The paradigm rule of the i2-declension clashed with 
the general schema for the other feminine microclasses in an important 
respect (the nominative singular in -is), and, by changing this, that schema 
was able to subsume the rule for the i2-declension as well. If the speakers 
had had only the rules for the individual declensions, this change would 
be mysterious. Thus, diachronic change indicates that speakers establish 
paradigmatic relations across inflection classes, and that paradigmatic 
relations are therefore part of the architecture of the morphological system. 


8.5 Stems and Priscianic formation 


In this section we show that not only inflectional endings, but also stems 
may be related along the paradigmatic dimension. 

In many languages, lexemes are associated with multiple inflectional 
stems, i.e. there is weak or strong stem suppletion. Consider the Persian 
verb forms in (8.25). 


(8.25) INFINITIVE         PAST TENSE   PRESENT TENSE
       mund-æn      1SG   mund-æm      mi-mun-æm
       ‘to stay’    2SG   mund-i       mi-mun-i
                    3SG   mund         mi-mun-e
                    1PL   mund-im      mi-mun-im
                    2PL   mund-id      mi-mun-id
                    3PL   mund-ænd     mi-mun-ænd
(Mahootian 1997: 28, 236) 


All past-tense forms and the infinitive share a stem (mund-), and all present- 
tense forms share another (mun-). All Persian verbs behave like MUNDAN in 
this respect, as the verbs in (8.26) show. 


(8.26) INFINITIVE   1ST SG PAST   1ST SG PRESENT
       mund-æn      mund-æm       mi-mun-æm        ‘stay’
       xærid-æn     xærid-æm      mi-xær-æm        ‘buy’
       mord-æn      mord-æm       mi-mir-æm        ‘die’
       šekaft-æn    šekaft-æm     mi-šekaf-æm      ‘split’
       šetaft-æn    šetaft-æm     mi-šetab-æm      ‘hurry’
       did-æn       did-æm        mi-bin-æm        ‘see’


(Mahootian 1997: 231-7) 


The relation between the past-tense stem and the present-tense stem is 
unpredictable for many verbs, but the past stem and the infinitival stem are 
always identical. Moreover, because the relationship between the present 
and the past/infinitive sometimes involves suppletion, it is natural to say 
that lexemes are associated in the lexicon with two stems that are restricted 
to occurring with particular inflectional values. 

Now, in the context of this chapter’s discussion, one of the more interesting 
facts about inflectional stems is that they can sometimes be built on other 
stems in the same paradigm. A well-known case in Latin is the past passive 
participle and the future active participle. Some representative forms are 
given in (8.27). 
(8.27) INFINITIVE   PAST PASS. PART.   FUTURE ACT. PART.
       laudāre      laudātus           laudātūrus          ‘praise’
       monēre       monitus            monitūrus           ‘warn’
       dūcere       ductus             ductūrus            ‘lead’
       vehere       vectus             vectūrus            ‘carry’
       mittere      missus             missūrus            ‘send’
       haerēre      haesus             haesūrus            ‘stick’
       premere      pressus            pressūrus           ‘press’
       ferre        lātus              lātūrus             ‘bear’


(Aronoff 1994: ch. 2) 


We could say here that each lexeme is associated with a set of three stems 
(e.g. laudā-, laudāt-, laudātūr-). But two facts are noteworthy. First, the future-
active participle stem is the same as the past-passive participle stem with 
-ūr added. Second, the relationship between the infinitive and the past-
passive participle is sometimes suppletive, but the relationship between the 
past-passive participle and the future-active participle is always regular. 
An alternative would thus be to describe the future-active participle in 
terms of Priscianic formation (so called because it was used by the Latin 
grammarian Priscian, in the sixth century CE), whereby a member of an 
inflectional paradigm is formed from another member of the paradigm to 
which it need not be closely related semantically. A Priscianic analysis of 
Latin would say that the form of the future-active participle is dependent 
upon the form of the past-passive participle. This can be represented with a 
word-based rule as in (8.28). 


(8.28) [/XY/PST.PASS.PART] ↔ [/XūrY/FUT.ACT.PART]


This rule nicely captures the regular relationship between the past-passive 
participle and the future-active participle. (Note that the meaning is quite 
independent of the form; obviously the future-active participle cannot be 
based semantically on the past-passive participle.) 
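
Read from left to right, rule (8.28) amounts to a simple string operation on the past-passive participle. The Python sketch below is an illustration only: the function name is our own, and Y is assumed to be the nominative singular masculine ending -us, as in the citation forms of (8.27).

```python
def future_active_participle(past_passive_participle: str) -> str:
    """Apply (8.28): /XY/ -> /XūrY/, with Y assumed to be the ending -us."""
    ending = "us"
    if not past_passive_participle.endswith(ending):
        raise ValueError("expected a citation form ending in -us")
    stem = past_passive_participle[: -len(ending)]  # X
    return stem + "ūr" + ending                     # X + ūr + Y
```

Note that the function takes the participle as its input, not the infinitive: even for a suppletive verb like ferre, the form lātūrus follows regularly once lātus is given.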

A description in terms of Priscianic formation is equivalent to a 
description in terms of stem sets for most purposes, and most linguists have 
continued to describe examples like (8.27) in terms of stem sets. (It is, after 
all, the approach that is consistent with the more dominant syntagmatic 
perspective.) But even for these linguists, such an analysis is perhaps less 
attractive in cases like the following from Tümpisa Shoshoni. This language 
has two non-nominative case forms, an objective case and a possessive case. 
The formation of these cases is illustrated in (8.29). 


(8.29) NOMINATIVE   OBJECTIVE     POSSESSIVE
       mupin        mupitta       mupittan       ‘nose’
       tümpi        tümpitta      tümpittan      ‘rock’
       nümü         nümi          nümin          ‘person’
       piammütsi    piammütsia    piammütsian    ‘baby’
       kahni        kahni         kahnin         ‘house’
(Dayley 1989: 185-6) 


The objective case is related to the nominative in various ways (one of 
them being identity to the nominative), but the possessive is systematically 
formed from the objective by adding a further suffix -n. If we wanted to 
describe this pattern without Priscianic formation, we would have to set 
up a non-nominative stem that yields the possessive form by addition of 
-n and the objective form by addition of nothing. Of course, this kind of 
purely syntagmatic description is perfectly possible, but nothing seems to 
be gained when compared to the Priscianic solution. 


8.6 Syncretism 


Not uncommonly, two word-forms in an inflectional paradigm are 
phonologically identical, or in other words, homonymous. For example, 
in the present-tense paradigm of German verbs, the third person singular 
and the second person plural, and the first and third person plural have the 
same form: 


(8.30) 1SG   (ich)      spiele    ‘I play’
       2SG   (du)       spielst   ‘you(SG) play’
       3SG   (er/sie)   spielt    ‘he/she plays’
       1PL   (wir)      spielen   ‘we play’
       2PL   (ihr)      spielt    ‘you(PL) play’
       3PL   (sie)      spielen   ‘they play’


When the inflectional homonymy is systematic, we speak of syncretism, 
and homonymous forms of a paradigm are called syncretic. Syncretism is 
thus a kind of ‘mismatch’ between form and inflectional function - one form 
for two or more (sets of) inflectional values. We will see that syncretism is 
perhaps the strongest piece of evidence for paradigmatic relations as part 
of morphological architecture. However, since the issues are somewhat 
complicated, we discuss them at some length in this section. 


8.6.1 Systematic versus accidental inflectional homonymy 


We must first be able to distinguish between systematic and accidental 
homonymy. We will discuss four tests: systematicity across inflection 
classes, syntactic functionality, patterns of language change and whether 
the affected paradigm cells form a natural class (see Zwicky 1991). 

The first test considers the extent to which a pattern of homonymy is 
found in different inflection classes. The two pairs of homonymous forms 
in (8.30) behave differently by this criterion. German has a small class 
of vowel-changing verbs that have a different stem vowel in the second 
and third person singular, e.g. gebe/gibst/gibt ‘give’, falle/fällst/fällt ‘fall’. In 
these verbs, the third person singular and the second person plural are not 
identical, because the vowel alternation is restricted to the second and third 
person singular (3SG gibt versus 2PL gebt, 3SG fällt versus 2PL fallt), but the first 
person and third person plural are still identical. In fact, the first person 
and the third person plural are identical in all German verb paradigms, 
including the suppletive paradigm of sein ‘be’ (singular: bin/bist/ist, plural: 
sind/seid/sind). So in this respect, the 1PL/3PL homonymy (spielen) is more 
systematic than the 3SG/2PL homonymy (spielt). 
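
The first test can be stated as a check over several paradigms: a pattern of homonymy counts as systematic by this criterion only if the two cells are identical in every inflection class. The Python sketch below applies the check to the four verbs just mentioned. The function name is our own, and the plural forms of geben and fallen that the text does not cite are filled in from standard German.

```python
# Present-tense paradigms of four German verbs.
PARADIGMS = {
    "spielen": {"1SG": "spiele", "2SG": "spielst", "3SG": "spielt",
                "1PL": "spielen", "2PL": "spielt", "3PL": "spielen"},
    "geben": {"1SG": "gebe", "2SG": "gibst", "3SG": "gibt",
              "1PL": "geben", "2PL": "gebt", "3PL": "geben"},
    "fallen": {"1SG": "falle", "2SG": "fällst", "3SG": "fällt",
               "1PL": "fallen", "2PL": "fallt", "3PL": "fallen"},
    "sein": {"1SG": "bin", "2SG": "bist", "3SG": "ist",
             "1PL": "sind", "2PL": "seid", "3PL": "sind"},
}


def homonymous_in_all_classes(cell_a: str, cell_b: str) -> bool:
    """True if the two cells have identical forms in every paradigm."""
    return all(p[cell_a] == p[cell_b] for p in PARADIGMS.values())


def exceptions(cell_a: str, cell_b: str) -> list:
    """List the verbs in which the two cells differ."""
    return [verb for verb, p in PARADIGMS.items() if p[cell_a] != p[cell_b]]
```

The 1PL/3PL pair passes the check for all four verbs, whereas the 3SG/2PL pair fails for geben, fallen and sein.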

The syntactic test concerns an interesting syntactic property of syncretic 
forms: they can be used in situations where two conflicting syntactic 
requirements must be fulfilled simultaneously. One such construction is 
shown in (8.31a), where the verb spielt has to agree simultaneously with 
both coordinands of the disjunction. Now there are situations where the two 
requirements are in conflict, as in (8.31b), where the verb is supposed to agree 
both with ich (first person singular) and with du (second person singular). 
Since there is no verb form that can do this, the sentence is ungrammatical. 


(8.31) a. Entweder Ballack oder Klose spielt gegen Bulgarien. 
‘Either Ballack or Klose will play in the Bulgaria match.’ 


b. *Entweder ich oder du spiele/spielst gegen Bulgarien.
‘Either I or you(SG) will play in the Bulgaria match.’


c. Entweder wir oder sie spielen gegen Bulgarien.
‘Either we or they will play in the Bulgaria match.’


d. *Entweder Bierhoff oder ihr spielt gegen Bulgarien. 
‘Either Bierhoff or you(PL) will play in the Bulgaria match.’ 


However, when the two requirements are first or third person plural, as 
in (8.31c), there is a way to resolve the feature-value conflict: the syncretic 
form spielen can serve simultaneously as a first person plural and as a third 
person plural form. In this, it contrasts with the two homonymous forms 
spielt '3rd sg' and spielt '2nd pl': as we see in (8.31d), the form spielt cannot 
resolve the feature-value conflict, and hence we say that, in the case of 
spielen, we have systematic homonymy (i.e. syncretism), whereas, in the 
case of spielt, we are dealing with accidental homonymy. The syntactic 
criterion shows that speakers treat the two syncretic forms as related. In 
the case of German verbs, this test gives the same results as does the first 
test: The identity of spielt (3SG) and spielt (2PL) is accidental, but the identity
of spielen (1PL) and spielen (3PL) is systematic — i.e. in the latter case we are 
dealing with syncretism. 

The ability to resolve a feature-value conflict can be taken as a sufficient 
criterion for systematic homonymy, but it cannot be a necessary one 
because sometimes there are no relevant syntactic constructions that would
impose conflicting requirements. For instance, if we want to know whether 
the frequent homonymy of the English past tense and the past participle 
(as in forms like played, fed, thought) is systematic, we cannot apply the 
syntactic test, because there are no constructions in which a verb should 
simultaneously be a past tense and a past participle. It is true that, for the 
vast majority of verbs, these forms are homonymous, but in Old English 
they were distinct for all verbs, and the present-day homonymy could 
be explained in almost all cases by regular phonological changes. Thus, 
the homonymy might still be accidental for English speakers. However, 
here the diachronic criterion can be invoked: there are a few verbs whose 
past-participle form became identical with the past-tense form through 
morphological (analogical), not phonological change: stand/stood/stood (cf.
Old English standan/stōd/gestanden), sit/sat/sat (cf. Old English
sittan/sæt/geseten). The morphological change is a strong indication that, at least at
the time of the change, the homonymy of the two forms was perceived as 
systematic by the speakers. 

Finally, when all else fails, we can reasonably guess that forms are 
systematically homonymous when they form a natural class, i.e. when 
they can be described by a single (set of) inflectional value(s), and they are 
the only word-forms that express that set. Consider the Lithuanian verb 
paradigm in (8.32) (present tense, indicative mood of sup- ‘shake, swing’).


(8.32) 

SINGULAR PLURAL 
1ST supu supame 
2ND supi supate 
3RD supa supa 


Here the two syncretic cells are the third person singular and the third 
person plural and they form a natural class, which is to say they are the 
only word-forms expressing third person. Such syncretisms may be called 
natural syncretisms, and patterns of this sort are likely to not be accidental. 
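
The natural-class test can be sketched in a few lines. The sketch assumes that each paradigm cell is a (person, number) tuple; that representation and the function name are assumptions made for illustration. A set of cells counts as a natural class if the values the cells share pick out exactly those cells and no others.

```python
# Lithuanian present indicative of sup-, from (8.32).
# Cells are (person, number) tuples; this encoding is an assumption.
LITHUANIAN = {
    ("1", "SG"): "supu", ("1", "PL"): "supame",
    ("2", "SG"): "supi", ("2", "PL"): "supate",
    ("3", "SG"): "supa", ("3", "PL"): "supa",
}


def is_natural_class(cells, paradigm):
    """True if `cells` are exactly the cells described by their shared values."""
    cells = set(cells)
    first = next(iter(cells))
    # Feature positions on which all candidate cells agree.
    shared = {i: v for i, v in enumerate(first)
              if all(c[i] == v for c in cells)}
    # Every cell of the paradigm that matches the shared values.
    described = {c for c in paradigm
                 if all(c[i] == v for i, v in shared.items())}
    return described == cells


print(is_natural_class({("3", "SG"), ("3", "PL")}, LITHUANIAN))
print(is_natural_class({("3", "SG"), ("2", "PL")}, LITHUANIAN))
```

The second call uses the cell pair of German spielt (3SG and 2PL): the two cells share no value, so the only description that covers both also covers the whole paradigm, and the pair is not a natural class.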


8.6.2 Underspecification 


Once we have established that a case of identity is systematic and not 
accidental, we can return to the question of whether syncretism requires 
a paradigmatic perspective in order to be adequately described. This 
question can be rephrased in the following way: Are we really dealing 
with two different cells in the paradigm, and a rule stating that they must 
have identical form (a paradigmatic approach), or is there perhaps just a 
single form that simply does not distinguish the relevant values (a non- 
paradigmatic approach)? 

Certainly, examples of natural syncretisms do not require the paradigmatic
approach, and they are quite easy to describe. We can simply say that the 
Lithuanian third person form of the verb is supa — i.e. we do not have to mention 
the number feature at all. We can visualize this description by a representation 
in which the syncretic form occupies an enlarged cell, as in (8.33). 


(8.33) 


SINGULAR PLURAL 
1ST supu supame 
2ND supi supate 
3RD supa 


In the more formal representation format of (8.20), we would say that the 
paradigm of sup- is described by the paradigm rule in (8.34), in which 
nothing is said about the number feature for the form supa. 


(8.34) { [/Xu/ 1.SG], [/Xi/ 2.SG], [/Xa/ 3], [/Xame/ 1.PL], [/Xate/ 2.PL] }


Such a mode of description is called underspecification: we simply do not 
specify the value of certain features in the paradigm rule. For the syntactic 
rule of agreement that interacts with these inflectional values, this means 
that it should not require feature-value identity, but only feature-value 
compatibility. Both a singular and a plural subject NP are compatible with 
a form like supa, so the agreement relation works, even though supa is not 
specified for number. 
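
The difference between feature-value identity and feature-value compatibility can be illustrated with a minimal sketch. The dictionaries and the function names are assumptions introduced for this illustration; the feature specifications follow the paradigm rule in (8.34), where supa carries no number value.

```python
# Feature specifications following (8.34); supa is underspecified for number.
SUP = {
    "supu": {"person": "1", "number": "SG"},
    "supi": {"person": "2", "number": "SG"},
    "supa": {"person": "3"},
    "supame": {"person": "1", "number": "PL"},
    "supate": {"person": "2", "number": "PL"},
}


def compatible(form_features, required):
    """A form is compatible if it contradicts none of the required values.

    A feature that the form leaves unspecified never causes a clash.
    """
    return all(form_features.get(feature, value) == value
               for feature, value in required.items())


def agreeing_forms(required, lexicon=SUP):
    """Return all forms compatible with the values the syntax requires."""
    return sorted(form for form, feats in lexicon.items()
                  if compatible(feats, required))


print(agreeing_forms({"person": "3", "number": "SG"}))
print(agreeing_forms({"person": "3", "number": "PL"}))
```

Under an identity requirement, supa would match neither a third person singular nor a third person plural subject, because it has no number value to compare; under compatibility it matches both.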

Sometimes an underspecification analysis is possible even when the 
syncretic cells do not constitute a natural class. Consider another example, 
Standard Arabic case inflection: 


(8.35) 
SINGULAR PLURAL 
NOM haywaan-un haywaan-aat-un 
GEN haywaan-in haywaan-aat-in 
ACC haywaan-an haywaan-aat-in 
‘animal’ ‘animals’ 


In the plural, the genitive and the accusative have the same form, and the 
usual analysis is that we are dealing with syncretism here (this genitive- 
accusative homonymy is found in all non-singular forms, so it is unlikely to 
be accidental). However, this is not an example of natural syncretism: only 
two of the three plural cells are syncretic. 

One possibility is to say that Arabic has a different case system in the 
plural that distinguishes only a nominative and an oblique case. In other 
words, one might propose (8.36): 


(8.36)
       SINGULAR                PLURAL
NOM    haywaan-un       NOM    haywaan-aat-un
GEN    haywaan-in       OBL    haywaan-aat-in
ACC    haywaan-an


But most linguists would not adopt this description, because it would make 
the rules of syntax more complicated. Instead of saying that a direct object 
is in the accusative case, we would have to say that it is in the accusative 
case in the singular and in the oblique case in the plural. Or possibly, our 
model could specify that oblique is somehow compatible with accusative 
and genitive. 

But a simple alternative is to posit that the nominative plural formally bears
the feature values NOM.PL, and the syncretic forms are underspecified for
case, bearing only the value PL. This causes an apparent problem: both word-
forms are then compatible with a syntactic rule that requires nominative 
plural, but only one of the words can appear in this context. However, we 
can easily solve this problem by assuming that the morphology provides to 
the syntax that word whose feature values are compatible in the most specific 
way. If the syntactic context requires a nominative plural, haywaanaatun 
meets these conditions in a more specific way than haywaanaatin because 
the former specifies both case and number values. This principle is a 
version of the elsewhere condition — more specific conditions apply before 
more general ones. The elsewhere condition is relevant to many areas of 
grammatical structure. 
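
The following sketch shows one way the elsewhere condition could select between the two Arabic plural forms. It assumes that specificity can be measured by the number of features a form specifies; that measure, the data layout and the function names are simplifying assumptions for illustration only.

```python
# Plural forms from (8.35) under the underspecification analysis:
# the nominative plural bears NOM.PL, the syncretic form bears only PL.
PLURAL_FORMS = [
    ("haywaan-aat-un", {"case": "NOM", "number": "PL"}),
    ("haywaan-aat-in", {"number": "PL"}),
]


def compatible(form_features, required):
    """A form is compatible if it contradicts none of the required values."""
    return all(form_features.get(feature, value) == value
               for feature, value in required.items())


def realize(required, entries=PLURAL_FORMS):
    """Pick the compatible form with the most specific feature set.

    Specificity is approximated by the number of specified features
    (an assumption of this sketch).
    """
    candidates = [(form, feats) for form, feats in entries
                  if compatible(feats, required)]
    if not candidates:
        return None
    return max(candidates, key=lambda entry: len(entry[1]))[0]


print(realize({"case": "NOM", "number": "PL"}))
print(realize({"case": "GEN", "number": "PL"}))
print(realize({"case": "ACC", "number": "PL"}))
```

Both forms are compatible with a nominative plural context, but the more specific form wins there; in genitive and accusative contexts only the underspecified form is compatible.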

Underspecification is thus a powerful tool for describing syncretism, but 
it is not restricted to the word-based model. The examples in this section can 
just as easily be described in the morpheme-based model; the Lithuanian 
example is given in (8.37). 


(8.37) a. [ /u/          ]    b. [ /a/          ]
          [ V __         ]       [ V __         ]
          [ PERSON:1ST   ]       [ PERSON:3RD   ]
          [ NUMBER:SG    ]


As in the word-based rule in (8.34), the morpheme lexical entry in (8.37b) 
is underspecified for number. This highlights that underspecification does 
not require a paradigmatic perspective. And inasmuch as all examples 
of syncretism can be described as feature underspecification, both the 
word-based model and the morpheme-based model offer equally viable 
descriptions. But of course, the key word in the preceding sentence is 
‘inasmuch’. In the following section we show that many instances of 
syncretism cannot be described by underspecification. Such examples 
require a paradigmatic approach. 


8.6.3 Rules of referral


Consider the three Old Church Slavonic nominal inflection classes in (8.38). 
Only the case endings are given here in order to save space. 


(8.38)        ŭ-class                a-class                i-class

         SG     DU     PL       SG     DU     PL       SG     DU     PL
NOM      -ŭ     -a     -i       -a     -ě     -y       -ĭ     -i     -i
ACC      -ŭ     -a     -y       -ǫ     -ě     -y       -ĭ     -i     -i
GEN      -a     -u     -ŭ       -y     -u     -ŭ       -i     -ĭju   -ĭjĭ
LOC      -ě     -u     -ěxŭ     -ě     -u     -axŭ     -i     -ĭju   -ĭxŭ
DAT      -u     -oma   -omŭ     -ě     -ama   -amŭ     -i     -ĭma   -ĭmŭ
INST     -omĭ   -oma   -y       -ojǫ   -ama   -ami     -ĭjǫ   -ĭma   -ĭmi


Especially in the dual, we have a lot of syncretism: The nominative and 
accusative, the genitive and locative, and the dative and instrumental are 
systematically homonymous (this is true also of the other inflection classes 
not shown here). These syncretisms are clearly not natural syncretisms, 
because these three pairs of cases do not have any exclusive properties. 
Moreover, these examples cannot (in any obvious way) be described using 
the device of feature underspecification. If all three pairs of syncretic dual 
forms were underspecified for case, all six dual forms would have the same 
feature specification (i.e. they would be marked only for being dual). The 
elsewhere condition would not function because no one form would be 
associated with more specific feature values. 

For such cases, we need a special type of rule that says that several forms 
in the paradigm are identical. Such rules are called rules of referral. We can 
formulate the rule for the nominative-accusative dual as in (8.39). 


(8.39)  [ /X/N      ]   ↔   [ /X/N      ]
        [ ‘NOM.DU’  ]       [ ‘ACC.DU’  ]


This rule generalizes over all the paradigms of Old Church Slavonic. A rule 
of referral can thus be thought of as a kind of paradigm rule schema that 
relates two cells in the paradigm to each other. And as should be obvious by 
this point, this kind of rule encodes paradigmatic relations. 
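
A rule of referral can be sketched as a lookup that redirects one paradigm cell to another. The sketch below makes two assumptions that go beyond the text: it treats the referral as directional (the accusative dual refers to the nominative dual, and so on), whereas (8.39) states a symmetric relation, and it stores only one ending per syncretic pair. The function and variable names are hypothetical; the endings are the Old Church Slavonic dual endings of the a-class and the i-class.

```python
# Referrals shared by all inflection classes: cell -> cell it copies.
# Treating the referral as directional is an assumption of this sketch.
REFERRALS = {"ACC.DU": "NOM.DU", "LOC.DU": "GEN.DU", "INST.DU": "DAT.DU"}

# Only one member of each syncretic pair is stored per class.
STORED_ENDINGS = {
    "a-class": {"NOM.DU": "-ě", "GEN.DU": "-u", "DAT.DU": "-ama"},
    "i-class": {"NOM.DU": "-i", "GEN.DU": "-ĭju", "DAT.DU": "-ĭma"},
}


def ending(inflection_class, cell):
    """Look up an ending, following a rule of referral if needed."""
    stored = STORED_ENDINGS[inflection_class]
    if cell in stored:
        return stored[cell]
    return stored[REFERRALS[cell]]


for cls in STORED_ENDINGS:
    print(cls, ending(cls, "NOM.DU"), ending(cls, "ACC.DU"))
```

Because the referrals are stated once and apply to every class, the identity of the paired cells does not have to be restated class by class, which is the generalization the rule schema is meant to capture.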

That such rules of referral are real for speakers and not just thought up 
by linguists becomes clear when they trigger morphological changes. An 
example comes from Old High German (Wurzel 1987: 70-1). The paradigm 
of neuter nouns of the a-declension that must have existed in pre-Old High 
German is shown in (8.40). 


(8.40)       SINGULAR   PLURAL     SINGULAR   PLURAL
       NOM   wort       wort       faz        fazzu
       ACC   wort       wort       faz        fazzu
       GEN   wortes     worto      fazzes     fazzo
       DAT   worte      wortum     fazze      fazzum
             ‘word’                ‘barrel’

The original suffix of the nominative/accusative plural was -u, as is clear from
comparative evidence. This suffix was lost by regular sound change in 
heavy-syllable words like wort, but it was preserved in light-syllable words 
like faz. Now apparently speakers formulated a rule of referral stating that 
the singular and the plural forms of the nominative and accusative are 
identical. This rule was originally based only on nouns of the wort class, 
but since the faz class had a much lower type frequency, it also came to be 
affected by this rule, and by the time of Old High German the paradigm of 
(8.40) had changed: the forms fazzu have been replaced by faz, in accordance 
with the rule of referral. 


8.7 More form-meaning mismatches 


Nobody is perfect, not even inflectional paradigms. In the previous section 
we encountered one way in which cells in an inflectional paradigm may 
have a mismatch between form and meaning; they may be identical to other 
cells in the paradigm. In this section, we look at two more ways in which 
inflection fails to correspond to the principle of one-form-one-meaning, and 
how description of these examples benefits from a paradigmatic approach. 


8.7.1 Defectiveness 


First of all, lexemes may simply lack word-forms. Lexemes with missing 
word-forms are called defective lexemes.⁷ An example of a defective lexeme
is the Italian verb incombere ‘be incumbent’, which lacks a past participle 
and therefore cannot be used in the compound past tense. In French, the 
verbs frire ‘fry’, déchoir ‘fall’ and clore ‘close’ lack an imperfective past tense. 
In English, many speakers feel that the verb forego sounds strange in the 
past tense (?? He foregoed/forewent treatment for cancer). In Russian, a number 
of verbs do not have a first person singular in the present/future tense (e.g.
pobedit’ ‘win, defeat’, *pobežu), and a few nouns like mečta ‘dream’ lack a
genitive plural form (*mečt). 

Defectiveness is surprising and interesting for a few reasons. First, it 
disturbs the functionality of the language. Sometimes one wants to say ‘I 
will win' in Russian, but the system does not allow it. Of course, speakers 
are not condemned to silence in such cases - there is always a way around 
the defective form. A Russian speaker can resort to the expression oderžu
pobedu [gain victory] ‘I will gain a victory’, and an English speaker can
avoid foregoed by choosing a semantically similar verb or phrase such as did
without or sacrificed.

⁷ Note that the term defectiveness is usually applied only to individual lexemes, not to entire
inflectional values. For instance, English systematically lacks one-word forms to express
the past-passive participle, but this is not normally called defectiveness.

Second, it is surprising that speakers can learn the negative fact that a 
lexeme lacks certain forms. Normally there is at least one productive pattern 
for each inflectional value, a default pattern that is used when no other 
pattern is remembered. But in defective lexemes, this default pattern is not 
used to ‘fill in’ missing forms. This suggests that defectiveness is not simply 
a situation in which speakers fail to learn the missing word-form. Rather, 
speakers learn that no word-form is used, not even the default. Moreover, 
in contrast to other irregular lexemes, which usually show a high frequency 
of use (see Section 12.3), many defective lexemes are rather rare. Some more 
examples of Russian verbs that are defective in the first person singular 
present/future are shown in (8.41). (In Russian, the perfective future and 
the imperfective present are formed in the same manner. The verbs are 
therefore defective in either the present tense or the future, depending 
on aspectual value.) The summed token frequency of all present/future 
tense forms is also listed, based on the modern (post-1950) subpart of the 
Russian National Corpus, a large corpus consisting mostly of written texts 
of various types (newspapers, magazines, fiction, religion and philosophy, 
law, technical and scientific works, letters, diaries, etc.).⁸


(8.41) LEXEME         EXPECTED 1SG    PRESENT/FUTURE       GLOSS
                                      FREQUENCY (IPM)⁹

       pobedit’       *pobežu         30                   ‘be victorious’
       ubedit’        *ubežu          23.2                 ‘persuade’
       oščutit’       *oščušču        7.3                  ‘feel’
       čudit’         *čužu           0.8                  ‘behave oddly’
       derzit’        *deržu          0.6                  ‘be imprudent’
       umiloserdit’   *umiloseržu     0                    ‘to take pity on’


We might posit that speakers learn that pobedit’ is defective simply by
having many opportunities to observe that all of its word-forms are used
- except the first person singular. But this does not seem to help explain
how lexemes like derzit’ and umiloserdit’ are learned to be defective, and
how defectiveness in these words can be stable for multiple generations. (A 
probable answer is that a kind of analogy is at work here; speakers learn that 
derzit’ is defective because other, similar verbs are also defective (Daland et 
al. 2007; Baerman 2008).) 

Finally, and relevant to the central discussion of this chapter, the most 
surprising thing about defectiveness is probably that it, too, sometimes 
exhibits paradigmatic dependencies. In Sections 8.2 and 8.3 we argued that 
the ability to predict one word-form from another is an important property
of inflection classes. Paradigm rules help explain, for example, why
lexemes sometimes shift from one inflection class to another. Here we note
that paradigm rules sometimes play a role in defectiveness as well: forms
that cannot be accurately predicted may become defective. For instance, a
handful of Spanish verbs are defective in the first person singular, e.g. abolir
‘abolish’ (1SG: *abolo, *abuelo), and sometimes other cells in the paradigm.
Crucially, the defective cells are exactly those which cannot be confidently
predicted based on other word-forms of the same lexeme, and are not
frequent enough to have been memorized by speakers (Albright 2003).

⁸ Russian National Corpus: http://ruscorpora.ru/en/. Frequency list based on modern
subcorpus: http://corpus.leeds.ac.uk/serge/frqlist/ (access: July 2010).
⁹ IPM = instances per million words of corpus


8.7.2 Deponency 


Another phenomenon of paradigm mismatch is deponency, whereby a formal 
marker of an inflectional value is used in the ‘wrong’ function, i.e. to express 
a different value. Consider the Modern Greek active and passive forms of 
pléno ‘wash’ in (8.42a). This represents the inflectional pattern that most verbs 
follow. However, a handful of verbs like déxome ‘receive, accept’, erγázome
‘work’, and érxome ‘come’ (8.42b), are semantically active, but nonetheless 
exhibit the inflection pattern that normally expresses the passive. 


(8.42) a.      ACTIVE     PASSIVE       b.  ACTIVE
       1SG     pléno      plénome           érxome
       2SG     plénis     plénese           érxese
       3SG     pléni      plénete           érxete
       1PL     plénume    plenómaste        erxómaste
       2PL     plénete    plenósaste        erxósaste
       3PL     plénun     plénonde          érxonde


Verbs like érxome, which have a paradigm from a different value but not the 
meaning of that value, exhibit deponency. 

In a purely syntagmatic approach, deponency is difficult to describe 
adequately. The morpheme-based model must treat -ome as meaning
‘passive first person singular’, -ese as meaning ‘passive second person
singular’, and so on, because morphemes bear meaning in this approach.
However, this principle runs into obvious problems with the deponent 
verbs. 

From a paradigmatic perspective this pattern is understood easily enough 
if we make one small assumption: paradigmatic relations akin to rules of 
referral can operate not only within paradigms, but also across inflection 
classes. The active in the deponent class can be specified as systematically 
taking the same inflectional endings as the passive in other inflection classes. 
(Proper rules of referral involve complete phonological identity, not only 
inflectional identity, but the same principle is at work here.) A paradigmatic 
approach thus offers a more intuitive description of deponency than is 
possible in a purely syntagmatic approach. 
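
The cross-class referral just described can be sketched as follows. The sketch is restricted to the singular cells of (8.42), where stem and ending can be separated by simple concatenation; the segmentation into stems (plén-, érx-) and endings, the `deponent` flag and the function name are assumptions made for illustration.

```python
# Singular endings of the regular pattern in (8.42a).
# Segmenting the forms into stem + ending is an assumption of this sketch.
REGULAR_ENDINGS = {
    ("ACTIVE", "1SG"): "o", ("PASSIVE", "1SG"): "ome",
    ("ACTIVE", "2SG"): "is", ("PASSIVE", "2SG"): "ese",
    ("ACTIVE", "3SG"): "i", ("PASSIVE", "3SG"): "ete",
}


def inflect(stem, voice, cell, deponent=False):
    """Build a word-form; deponent actives are referred to passive endings."""
    if deponent and voice == "ACTIVE":
        # Cross-class referral: the active of the deponent class takes
        # the endings that express the passive in the regular class.
        voice = "PASSIVE"
    return stem + REGULAR_ENDINGS[(voice, cell)]


print(inflect("plén", "ACTIVE", "1SG"))
print(inflect("plén", "PASSIVE", "1SG"))
print(inflect("érx", "ACTIVE", "1SG", deponent=True))
```

The endings keep a single entry each; what is special about the deponent class is stated once, as a relation between cells of different classes, rather than by giving -ome a second, active meaning.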


8.8 Periphrasis


Not uncommonly, missing cells are not completely empty, but may be filled 
by multi-word phrases that express the needed concepts in a conventional 
way. For example, many English adjectives lack ordinary comparative 
forms in -er. We have warm-er, nic-er, pretti-er, but we do not have *beautifuller,
*interestinger, *activer. However, morphologists do not say that the lexemes
beautiful, interesting and active are defective in lacking a comparative form, 
because there is a well-established convention for expressing the value: 
more beautiful, more interesting, more active. Such comparatives are called 
periphrastic, and the phenomenon is periphrasis. Another example comes 
from Romanian, where nouns inflect for an oblique case (e.g. prietenul ‘the
friend (NOM)’, prietenului ‘the friend (OBL)’, Ana ‘Ana (NOM)’, Anei ‘Ana
(OBL)’). However, masculine personal names such as Petre lack an ordinary 
oblique case. In order to use them in a syntactic slot that requires the oblique 
case, a periphrasis involving the pronoun lui ‘him’ is used (lui Petre ‘Petre 
(OBL)’). 

These examples represent cases of lexical periphrasis, where certain 
(groups of) lexemes lack word-forms for certain values. But we also 
find paradigmatic periphrasis — i.e. cases in which entire word-classes 
lack word-forms for certain combinations of inflectional values. A well- 
known example of this comes from Latin verbs. The passive is sometimes 
expressed morphologically for verbs (e.g. when combined with present 
or imperfect meanings). Likewise, the perfect and the pluperfect are 
expressed morphologically when combined with the active. However, the 
combination of perfect/pluperfect and passive is periphrastic for all verbs. 
In (8.43) we see the third person singular forms of some tense-aspect-voice 
combinations of the verb scribere ‘write’. The perfect and pluperfect passive 
forms are expressed by the past passive participle plus the verb esse ‘be’. 


(8.43) 
           PRESENT      IMPERFECT      PERFECT         PLUPERFECT
ACTIVE     scribit      scribebat      scripsit        scripserat
PASSIVE    scribitur    scribebatur    scriptum est    scriptum erat


We should be careful to distinguish paradigmatic periphrasis from another 
kind of periphrasis that we may call categorial periphrasis. In categorial 
periphrasis, a given grammatical function is always expressed with a 
multi-word expression. For example, French is sometimes said to have a 
periphrastic future involving the auxiliary verb aller ‘go’, e.g. je vais faire 
‘I’m going to do’, tu vas faire ‘you’re going to do’, il va faire ‘he’s going to
do’, and so on. The crucial difference, as compared with (8.43), is that the 
French future is never expressed with a single word, whereas the Latin
passive sometimes is. In contrast to paradigmatic periphrasis, then, such 
cases of categorial periphrasis have nothing to do with morphology, and 
the morphologist can ignore them. 

Periphrasis does not appear to exhibit paradigmatic dependencies 
in the same way that deponency and syncretism do. But it is significant 
that paradigmatic (i.e. morphologically relevant) periphrasis can only
be identified in the context of other paradigmatic forms - the relevant 
inflectional values must somewhere in the paradigm be expressed with 
a single word. Moreover, some linguists have argued that paradigmatic 
periphrasis tends to exhibit morphological properties (e.g. non- 
compositional meaning), and periphrastic examples as in (8.43) therefore 
should be formally considered as part of a lexeme's paradigm, but there is 
no consensus on this point. 


Summary of Chapter 8 


Inflection classes have a number of properties. They are delineated by 
sets of suppletive inflectional allomorphs, which are typically linked 
to properties such as the phonological shape of the base, the lexeme's 
meaning (e.g. animacy, transitivity) and/or morphological properties 
such as the derivational pattern. Although inflection class and gender 
are clearly related, their relationship is complex. Inflection classes may 
differ in productivity, seen in their ability to apply to novel lexemes 
(loanwords or productively formed neologisms) and in their ability to 
attract class-shifting lexemes. 

The balance between syntagmatic and paradigmatic description 
of (inflectional) structure is a major issue for the description of 
morphological architecture. This chapter focused on evidence that 
a syntagmatic perspective is insufficient by itself; formal devices 
for a paradigmatic description of inflection are also needed. The 
primary evidence comes from inflection class-shift, similarities across 
inflection classes (described in terms of inheritance hierarchies), 
Priscianic formation (one stem being built on another stem in the 
paradigm) and form-meaning mismatches: syncretism (one form for 
two sets of inflectional functions), defectiveness (lack of a form for 
a given function), deponency (a form with an unexpected function), 
and to a lesser extent, periphrasis (a function is expressed by a multi- 
word phrase). 


Further reading 


Book-length studies on inflection include Matthews (1972), Carstairs (1987), 
Wurzel (1989) and Stump (2001a). A typologically oriented overview article
is Bickel and Nichols (2007). For inflection classes and stems, see Aronoff
(1994) and Blevins (2003), and for inheritance hierarchies, see Corbett and 
Fraser (1993) and Stump (2001b). For productivity, see Dressler (1997) and 
Bauer (2001b). 

Various mismatches between form and inflectional function have been 
studied individually. For syncretism (including discussion of rules of 
referral, underspecification and related approaches), see Zwicky (1985a), 
several of the papers in Plank (1991), Noyer (1998) and Baerman et al. (2005). 
The distinction between systematic and accidental homonymy is set out 
in Zwicky (1991). For periphrasis, particularly the debate about whether 
periphrastic constructions are morphological or syntactic, see Börjars et al.
(1997), Embick (2000), Ackerman and Stump (2004) and Kiparsky (2005). 
An overview of periphrasis is Haspelmath (2000). For deponency, see the 
papers in Baerman et al. (2007). For defectiveness and its relationship to 
paradigmatic dependencies, see Hansson (1999), Albright (2003), Sims 
(2006), Daland et al. (2007) and the papers in Baerman et al. (2010). 

For the relationship between gender and inflection class, see Corbett 
(1982, 1991), Aronoff (1994) and Spencer (2002). 


Comprehension exercises 


1. Using the rules given in connection with (8.11), form the perfective 
form of the following Tagalog verbs: 


root        basic form with voice affix
langoy      lumangoy       ‘swim’
wagayway    wumagayway     ‘wave’
takot       matakot        ‘be afraid’
uhaw        mauhaw         ‘be thirsty’
buhat       buhatin        ‘raise’
punit       punitin        ‘rip’
punas       punasan        ‘wipe’


2. Near the end of Section 8.4, we said that ‘in Figure 8.2, one of the rule 
schemas could be dispensed with if the description of Figure 8.3 were 
adopted’. Which rule schema could be dispensed with? What would 
the modified version of Figure 8.2 look like? 


3. Take a complete list of English ‘irregular verbs’ based on past tense 
formation and try to group them into small inflection classes. Which 
classes can be established? Which verbs must be said to be truly 
irregular — i.e. cannot be put into a class with some other verb(s)? 


4. Consider the following three inflection classes of Ancient Greek (only 
singular forms are given). Class (i) consists of feminines (like the Latin
class of insula ‘island’), class (ii) consists of masculines denoting men
(like the Latin class of poeta ‘poet’) and class (iii) mostly consists of
masculines. The nouns of class (ii) originally inflected just like class (i).
What may have motivated the change?


        (i)         (ii)           (iii)
NOM     hemérā      neanías        phílos
ACC     hemérān     neanían        phílon
GEN     hemérās     neaníou        phílou
DAT     hemérāi     neaníai        phíloi
        ‘day’       ‘young man’    ‘friend’


5. Consider the following four inflection classes of Russian nouns, and
try to set up an inheritance hierarchy corresponding to Figure 8.2 (see
Corbett and Fraser 1993). (Note that <y> and <i> stand for the same
phoneme. Also, note that the spelling obscures the stem shape in (iii),
and the following all stand for the phoneme /tʲ/: <t’>, <tj>, and plain
<t> when followed by <e> or <i>.)


           (i)         (ii)         (iii)       (iv)
           ‘law’       ‘room’       ‘bone’      ‘swamp’
NOM.SG     zakon       komnata      kost’       boloto
ACC.SG     zakon       komnatu      kost’       boloto
GEN.SG     zakona      komnaty      kosti       bolota
DAT.SG     zakonu      komnate      kosti       bolotu
INST.SG    zakonom     komnatoj     kost’ju     bolotom
LOC.SG     zakone      komnate      kosti       bolote
NOM.PL     zakony      komnaty      kosti       bolota
ACC.PL     zakony      komnaty      kosti       bolota
GEN.PL     zakonov     komnat       kostej      bolot
DAT.PL     zakonam     komnatam     kostjam     bolotam
INST.PL    zakonami    komnatami    kostjami    bolotami
LOC.PL     zakonax     komnatax     kostjax     bolotax


6. English has few cases in which syncretism could be observed. However, 
consider the present-tense and past-tense paradigms of be: 


I          am     was
you(SG)    are    were
he/she     is     was
we         are    were
you(PL)    are    were
they       are    were


Apply the criteria of Section 8.6.1 to determine whether second person 
singular are and were are systematically syncretic with plural are and were. 


Exploratory exercise


In Section 8.6 we saw a few examples of syncretism, a kind of mismatch 
between form and inflectional function. A question that we did not ask, 
however, is whether syncretism tends to affect one part of a lexeme’s 
paradigm more than another. In this exercise, you will explore cross- 
linguistic patterns of syncretism using the Surrey Person Syncretism 
Database (Baerman 2002). This database contains examples of person 
(-number-gender) syncretism in verbal inflection classes. The data come 
from a geographically and genetically diverse sample of 111 languages. You 
will use this data to develop an answer to the question above, and consider 
possible explanations for any observed patterns. 


Instructions 


Step 1: Develop a hypothesis and predictions. For instance, in a language 
in which verbs agree with controller nouns for three person values (first, 
second, third), and two number values (singular, plural), would you expect 
to find that the third person singular and third person plural are more 
commonly syncretic than are the third person singular and first person 
singular? Or the opposite pattern? Or no difference? What about the third 
person singular and the first person plural? Do you expect to find more 
examples of ‘natural’ syncretism, and if so, in which values of person/
number/gender/etc. features? In developing your predictions, you might 
find it helpful to inspect the examples of syncretism presented in this 
chapter. 

(Having three person values and two number values is common among 
verbs that agree with pronouns and nouns for person and number. But of 
course, verbs may express more or fewer than two number values, may 
agree also for gender, the first person plural and dual may be subdivided 
into inclusive (‘we, including the addressee’) and exclusive (‘we, not 
including the addressee’) forms, etc. Be sure to include other inflectional 
values in your predictions.) 

Based on what you have read in this chapter and elsewhere in the book, 
explain the rationale underlying your predictions. If you expect to find 
asymmetries in attested patterns of syncretism (more in one value than 
another), why do you think that such an asymmetry might exist? If you do 
not expect to find any asymmetrical patterns, why not? 

Step 2: Familiarize yourself with the data set. The Surrey Person 
Syncretism Database is available here (as of July 2010):
www.smg.surrey.ac.uk/personsyncretism/index.aspx. Explore its content, structure, and
the theoretical assumptions it makes about syncretism. Begin by reading 
the document ‘How to Use the Database’. Also, notice in particular that 
in results returned by a query, clicking the ‘Example’ button will show the 
entire relevant paradigm. 

Step 3: Develop an appropriate classification/counting method. For 
instance, if you expect to find more syncretism among singular forms than 
among plural ones, or vice versa, it might seem obvious that you want to 
count the number of examples in the database with syncretism in the plural, 
and the number with syncretism in the singular. However, a single language 
may have more than one example of syncretism. Should each example be 
counted, or each language? (A question to consider here is whether multiple 
examples of syncretism in the same language are likely to be independent of 
each other.) Also, what if a language has syncretism between all plural cells, 
but also one singular cell; how should this kind of example be counted? 
There are no absolute ‘right’ answers to these questions. The important 
thing is that you develop precise criteria, justify them to the extent possible, 
and most of all, be consistent in applying them. 
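
To see how much the choice of counting unit can matter, here is a minimal Python sketch. The records are invented for illustration (they are not drawn from the Surrey database), and the language labels and function names are hypothetical. It tallies syncretism by number value twice: once counting every example, and once counting each language at most once per value.

```python
from collections import Counter

# Hypothetical records: (language, number value in which the syncretism occurs).
# These are invented placeholders, not data from the Surrey database.
records = [
    ("Language A", "plural"),
    ("Language A", "plural"),
    ("Language A", "plural"),
    ("Language B", "singular"),
    ("Language C", "singular"),
]

def count_by_example(rows):
    """Count every example of syncretism separately."""
    return Counter(number for _, number in rows)

def count_by_language(rows):
    """Count each language at most once per number value."""
    return Counter(number for _, number in set(rows))

by_example = count_by_example(records)
by_language = count_by_language(records)
print(dict(by_example))
print(dict(by_language))
```

With these made-up records, the per-example count favours the plural (3 to 2), while the per-language count favours the singular (2 to 1). Neither method is the ‘right’ one, but the two can point in opposite directions, which is why the criteria should be fixed before the data are collected.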

Step 4: Collect data. Using the search interface for the Surrey Person 
Syncretism Database, collect data that can be used to support or reject your 
hypothesis. 

Step 5: Draw conclusions. Was your hypothesis supported? And more 
generally, what do the data suggest about syncretism? Are all cells in a 
paradigm equally likely to be syncretic, or are there asymmetries? Speculate 
about the reasons for any observed patterns. And as always, be sure to 
consider the (potential) influence of your research methodology on your 
results, especially the impact of the criteria that you created in Step 3. 

Step 6 (optional): This exercise previews discussion in Section 12.1.3. 
Do your results support the claims made there, or contradict them? Explain 
your reasoning. 


Words and phrases 


So far in this book we have pretended that it is easy to distinguish words 
from phrases. This has been possible because in the modern European 
writing system, the boundary between words is often indicated by a blank 
space. The segmentation of a sentence into word tokens thus seemed to be 
a straightforward matter — a word is surrounded by spaces. 

However, when we look closely, we find this procedure does not work 
very well. Not all writing systems indicate word boundaries. In Chinese, 
for instance, there are never blank spaces between characters. And even in 
languages that use the modern European writing system, the conventional 
spelling is occasionally ambiguous. Sometimes the spelling vacillates, as 
in English compounds (e.g. flower pot, flower-pot, flowerpot). Sometimes 
boundary symbols other than a blank space are used - for example, the 
apostrophe (as with the English genitive 's, e.g. Joan's book) or the hyphen 
(as with object pronouns in the French imperative, e.g. donne-le-moi ‘give it 
to me’). Sometimes the same element is spelled differently under different 
circumstances. In Spanish, weak object pronouns are spelled separately 
when they precede the verb (e.g. lo hacemos ‘we do it’), but together with the 
verb when they follow it (hacerlo ‘to do it’). Also, in German the infinitive 
marker is spelled separately in most cases (e.g. zu bringen ‘to bring’), but 
together with the verb when it is preceded by a prefix such as ein- ‘in’ (e.g. 
einzubringen ‘to bring in’). 

So in short, the hints from the spelling can be contradictory and 
misleading, and we cannot rely on the writing system of a language when 
trying to determine whether an expression is a word, a phrase, or an 
affix. The rules for orthographic word division are to some extent simply 
traditional in languages with a long written history. And when a language 
is first written down, language users often disagree on where to put blank 
spaces between words, and when a conventional spelling is agreed on, the 
decisions are sometimes clearly arbitrary. To distinguish between words, 
phrases and affixes, we must develop other criteria. 

In this chapter we begin by showing that words and phrases exhibit 
different properties, and that these can be used to identify word boundaries. 
In Section 9.1 we address a common area of difficulty — distinguishing 
compounds from phrases. We then go on to discuss a more complicated issue 
in Sections 9.2 and 9.3, namely, expressions that fall along the continuum 
between canonical affixes and canonical words. These are clitics. Lastly, we 
consider whether a distinction between words and phrases is important 
for a formal description of language structure. Fundamentally, this is a 
question of whether syntactic principles apply to word structure. The 
relationship between morphology and syntax arose already in Chapters 5 
and 7. In Section 9.4 we look again at this issue in the context of something 
called the Lexical Integrity Hypothesis. 


9.1 Compounds versus phrases 


A common situation in which we might ask the question whether an 
expression is a single word or a syntactic phrase involves (potential) 
compounds. For instance, are the expressions backboard, backdoor, back 
seat compounds or phrases? In this section we discuss some properties of 
compounds that allow us to distinguish them from phrases. 

In many cases, compounds are easy to tell apart from phrases with two 
content words. For instance, compounds may consist of two (or more) 
lexeme stems that are juxtaposed in a single word-form, and, when a 
language does not allow phrases consisting of two juxtaposed lexemes 
of those same word-classes, the combination must be a compound. For 
example, German Holzhaus [wood-house] must be a compound noun 
because two juxtaposed nouns cannot by themselves form a noun phrase 
in German. Also, Italian segnalibri [indicate-books] 'bookmark' must be a 
compound, because it is structurally not similar to a phrase with a similar 
meaning. (Italian has a phrase segna libri whose pronunciation is the same, 
but this is an imperative verb phrase and means ‘indicate books’, so both 
syntactically and semantically it is clearly distinct from the compound 
segnalibri.) Occasionally compounds even have a special segmental marker. 
Thus, in Coast Tsimshian an -m- interfix between the two members indicates 
a compound, e.g. gyemg-m-dziws [light-INTF-day] ‘sun’, güünks-m-hoon 
[dry-INTF-fish] ‘dried fish’ (Dunn 1979: 55). And we saw in (7.4) that the 
interfixes -s- and -en- are used in German to form compounds (Liebe-s-brief 
‘love letter’, Schwan-en-gesang ‘swansong’). 

However, there are also a great many cases in which compounds are quite 
similar to phrases with a similar meaning, and then we have to take a closer 
look in order to distinguish the two patterns. For example, Lango has an 
inalienable possessive construction with the order head-possessor that is 
expressed by simple juxtaposition (e.g. the syntactic phrases wi rwot [head 
king] ‘the king's head’, bad dàktàl [arm doctor] ‘the doctor's arm’). Now Lango 
has expressions that look like compounds at first blush, e.g. waŋ ɔt [eye house] 
‘window’, dɔ́g bɔ́ŋɔ́ [mouth dress] ‘hem’ (Noonan 1992: 115, 157-8). Their 
most striking property is that they are idiomatic, i.e. their meaning cannot be 
determined from the meaning of their constituents. Idiomaticity is a typical 
property of compounds. However, it is neither a necessary nor a sufficient 
criterion for identifying a compound. On the one hand, all languages with 
productive compounding have some compounds with compositional 
meaning (English examples are piano-tuner, brake cable, spring festival). On the 
other hand, not all idioms are compounds. Idioms like English spill the beans, 
French roulette russe ‘Russian roulette’ or German goldenes Zeitalter ‘golden 
age’ are formally just like ordinary syntactic phrases in the language, and the 
general assumption is therefore that they are idiomatic phrases. Thus, one 
might suspect that Lango expressions like waŋ ɔt ‘window’, dɔ́g bɔ́ŋɔ́ ‘hem’ 
are simply phrases that happen to be semantically idiomatic. 

Now, in actual fact, this seems unlikely, because Lango also has clear 
compounds of the type N-N, e.g. ɔt cém [house-eating] ‘restaurant’, mɔɔ 
ɲim [oil sesame] ‘sesame oil’. These cannot be phrases, because ɔt and 
mɔɔ are not inalienable nouns, and this kind of possessive construction is 
possible only with inalienable nouns such as kinship terms and body part 
terms. Thus, waŋ ɔt and dɔ́g bɔ́ŋɔ́ are probably compounds. But the point is 
exactly that there can be ambiguity. 

So how can we distinguish a compound from a syntactic phrase when 
ambiguity arises? First, a semantic property of almost all compounds is 
that a dependent noun does not denote a particular referent but the entire 
class; in other words, a dependent noun in a compound is not referential 
but generic. For example, in the compound piano-tuner, the element piano 
cannot refer to a particular piano, but must refer to pianos in general. Generic 
meaning is also a general feature of dependent nouns in verb-headed N-V 
compounds (i.e. in noun incorporation), as the examples in (9.1)-(9.2) show. 
The (a) examples show a non-incorporated, phrasal version, and the (b) 
examples show an incorporated version of the sentence, with a generic 
interpretation of the incorporated noun. (Note in (9.1b) the absence of the 
second determiner ki, i.e. the one that serves to pick out particular wood in 
(9.1a). Likewise, the demonstrative marker -o is missing in (9.2b).) 


(9.1) Lakhota 
a. Wichasa ki  cha  ki  kaksa-he. 
   man     the wood the chop-CONT 
   ‘The man is chopping the wood.’ 


b. Wichasa ki  cha-káksa-he. 
   man     the wood-chop-CONT 
   ‘The man is chopping wood.’ (Lit.: ‘The man is wood-chopping.’) 
(Van Valin and LaPolla 1997: 123) 

(9.2) Ponapean 
a. I   pahn kang wini-o. 
   1SG FUT  eat  medicine-DEM 
   ‘I will take that medicine.’ 


b. I   pahn keng-wini. 
   1SG FUT  eat-medicine 
   ‘I will take medicine.’ (Lit.: ‘I will do medicine-taking.’) 
(Rehg 1981: 209-14) 


In syntactic phrases, by contrast, a noun is more typically referential, as in 
(9.1a) and (9.2a). 

However, generic interpretation is not a sufficient criterion by itself. A 
dependent noun in a noun phrase need not necessarily be referential. In the 
German phrase Haus aus Holz ‘house from wood’, Holz ‘wood’ can be just 
as generic as in Holzhaus ‘wood house’, the compound that we saw above. 
This means that we cannot conclude that the expression is a compound just 
because a dependent noun is generic. But, conversely, if a dependent noun 
is referential (as in Lango wi rwót 'the king's head', which refers to the head 
of a particular king), we can be fairly certain that the expression is a phrase 
and not a compound. 

Since the typical semantic properties of compounds are not unique to 
compounds, we often need additional phonological, morphological and 
syntactic properties to identify compounds when compound and phrase 
patterns are otherwise formally similar. In general terms, compounds 
exhibit greater phonological, morphological and syntactic cohesion than 
phrases. 

A well-known phonological criterion is stress. In English, each word 
has one main stress, so main stress on only one member of a compound- 
like expression suggests that it is a word. Thus, the expressions in (9.3a) 
are compounds, whereas those in (9.3b) are generally taken to be phrases. 
(As these examples show, word division in the spelling correlates only 
imperfectly with the criterion of stress.) 


(9.3) a. góldfish 
báckdrop 
Whíte House 


b. góld médal 
báckstáirs 
whíte kníght 


Stress is also one of the criteria that show that Lakhota incorporation 
((9.1b) above) is a compounding pattern: the expression Chákáksahe ‘wood-chopping’ 
is phonologically cohesive in the sense that it acts as a single unit 
for the purpose of stress assignment. 

An example of a different kind of phonological cohesion comes from 
Chukchi. In this language, compounding creates a single domain for vowel 
harmony. Within a compound, the vowels must either all belong to the 
set [i], [e], [u], or all belong to the set [ə], [a], [o]. Thus, when kupre-n ‘net’ 
occurs in a compound, it may have to be changed to kopra-n (e.g. polvonto- 
kopra-n ‘metal net’). In many better-known languages with vowel harmony 
(e.g. Turkish), compound nouns do not count as a single domain for vowel 
harmony, so an expression that does not show harmony could be either 
a phrase or a compound. But vowel harmony never applies across word 
boundaries, so when harmony does affect both lexemes, as in Chukchi, we 
can conclude that the expression is a compound. 
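
The harmony test can be made concrete with a small Python sketch. It assumes the two vowel sets exactly as given above and uses the forms as they are transcribed in this section; the function names are hypothetical. A form counts as a single harmony domain if all of its vowels come from one set.

```python
# The two harmony sets as given in the text.
SET_ONE = set("ieu")
SET_TWO = set("əao")

def vowels(form):
    """Return the harmonizing vowels of a form, ignoring consonants and hyphens."""
    return [ch for ch in form if ch in SET_ONE or ch in SET_TWO]

def is_single_harmony_domain(form):
    """True if every vowel in the form belongs to the same harmony set."""
    vs = vowels(form)
    return all(v in SET_ONE for v in vs) or all(v in SET_TWO for v in vs)

for form in ["kupre-n", "kopra-n", "polvonto-kopra-n", "polvonto-kupre-n"]:
    print(form, is_single_harmony_domain(form))
```

The unattested combination polvonto-kupre-n is included only to show a form that mixes the two sets. Note that passing the check is not by itself proof of compound status, since two words can agree in their vowels by accident; the argument in the text rests on the fact that the second member changes its vowels when it is combined with the first.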

In some cases, morphological cohesion can give us decisive criteria for 
word status. In the relevant examples, a morphological pattern clearly takes 
the whole compound in its domain rather than just the head. Consider the 
English word sister-in-law, which for many speakers has the plural form 
sister-in-laws. The older form sisters-in-law, which has the plural suffix 
on the head noun, could be either a phrase or a compound noun (with 
the head serving as the morphosyntactic locus; see Section 7.2 for similar 
examples), but sister-in-laws can only be a compound. The plural suffix -s 
is semantically associated with the entire unit, and not only with law (it 
indicates multiple sisters, not multiple laws). And since the plural marker 
normally attaches only to words, not to phrases, sister-in-law must be a 
compound. Similarly, in Ponapean the aspectual suffix -(a)la attaches to 
verbs, as shown in (9.4a). 


(9.4) a. I   kang-ala  wini-o. 
         1SG eat-COMPL medicine-DEM 
         ‘I completed taking that medicine, i.e. I took all of that medicine.’ 


      b. I   keng-winih-la. 
         1SG eat-medicine-COMPL 
         ‘I completed my medicine-taking.’ 
(Rehg 1981: 214) 


The position of the completive affix in (9.4b) is evidence that keng-wini(h) is 
a compound verb. 

Where phonological and morphological criteria are not decisive, 
criteria of syntactic cohesion can differentiate between compounds and 
phrases. Most obviously, syntactic phrases and compounds differ with 
regard to separability: phrases are often separable, whereas compounds 
are inseparable. This means that other words cannot intervene between 
compound members. For example, Hausa has N-N compounds that 
resemble phrasal possessive constructions in that they show head-dependent 
order and a relation marker (-n (masculine)/-r (feminine)) on 
the head, e.g. gida-n-sauroo [house-REL.M-mosquito] ‘mosquito net’. There 
are no phonological or morphological properties that would distinguish 
such compounds from possessive phrases like gida-n Muusaa 'Musa's 
house'. However, when an adjective modifies these expressions, it becomes 
clear that the compound is inseparable, whereas the phrase is separable. 


(9.5) a. gida-n-sauroo bàbba (*gidaa babba na sauroo) 
house-REL.M-mosquito big 
‘big mosquito net’ 


b. gidaa babba na Muusaa 
house big REL.M Musa 
‘Musa's big house’ 
(Newman 2000: 109) 


Another clear indication of phrasal status is the expandability of the 
dependent element, because dependents in compounds cannot generally be 
expanded by modifiers such as adjectives or adverbs (e.g. English kingmaker 
versus *illegitimate kingmaker ‘someone who makes an illegitimate king’; 
crispbread versus *very crispbread ‘bread that is very crisp’). 

In compounds, the head noun cannot be replaced by an anaphoric 
pronoun.¹ For instance, English allows (9.6a), but not (9.6b). Silversmith 
must be a compound. 


(9.6) a. My aunt has one gold watch and three silver ones 
(i.e. three silver watches). 


b. *My aunt knows one goldsmith and three silver ones 
(i.e. three silversmiths). 


By contrast, in Japanese, complex verbal expressions like benkyoo suru 
[study do] ‘study’ and rakka suru [fall do] ‘fall’ are sometimes regarded as 
N-V compound verbs. However, the noun in these combinations can be 
omitted with an anaphoric interpretation. See (9.7), where the noun rakka 
does not occur in the response. This suggests that these expressions are 
phrases after all. 


(9.7) Sore wa  rakka si-masi-ta    ka?  – Hai, si-masi-ta. 
      it   TOP fall  do-POLITE-PST INT    yes  do-POLITE-PST 

‘Did it fall? — Yes, it did.’ 
(Matsumoto 1996: 41) 


(The dependent noun in a compound cannot be replaced by an anaphoric 
pronoun either (*the king and the him-makers), but this is not very useful 
as a test. As we have seen already, the dependent noun is almost always 
generic. Anaphoric pronouns cannot be interpreted generically, so there is 
an independent reason for this failure.) 

1 An anaphoric pronoun refers back to some noun that has already been introduced in the 
sentence or discourse. 

Finally, phrases can exhibit coordination ellipsis, meaning that one of 
two identical elements in coordinated phrases can be optionally left out. 
By contrast, a compound member generally cannot be deleted in this way; 
compare (9.8b) to (9.9b). 


(9.8) a. Large fish and small fish were mistakenly placed in the same tank. 
      b. Large Ø and small fish were mistakenly placed in the same tank. 


(9.9) a. Flying fish and small fish were mistakenly placed in the same tank. 
      b. *Flying Ø and small fish were mistakenly placed in the same tank. 


Flying fish must be a compound. 

Thus, compounds can be distinguished from phrases by semantic, 
phonological, morphological and syntactic criteria. These are summarized 
in Table 9.1. 

               Phrases                            Compounds 
semantic       dependent noun may be referential  dependent noun virtually always generic 
               head may be replaced by an         head may not be replaced by an 
               anaphoric pronoun                  anaphoric pronoun 
phonological   less cohesion                      greater cohesion, e.g. compound as domain 
                                                  of stress assignment, vowel harmony 
morphological  no cohesion                        greater cohesion, e.g. compound as 
                                                  domain of affixation 
syntactic      separable                          inseparable 
               dependent noun expandable          dependent noun not expandable 
               coordination ellipsis possible     coordination ellipsis impossible 


Table 9.1 Phrases versus compounds 


While the discussion here has focused on compounds, some of these 
criteria (e.g. phonological and morphological cohesion, nonseparability 
and coordination ellipsis) apply to other types of words as well. These can 
thus be used as general tests for distinguishing words from phrases. Some 
examples that do not involve compounds will be presented below. 


9.2 Free forms versus bound forms 


The boundary between words and phrases is not always clear - tests for 
wordhood may produce contradictory results. One of the more interesting 
complications is that an expression may be a word for the purposes of 
syntax, but not by itself an entire prosodic word (i.e. a word for purposes of 
stress assignment). These are clitics. In this section we compare clitics and 
affixes on the one hand to independent word-forms on the other. 

Clitics and affixes (collectively referred to as bound forms)² are similar 
in that they exhibit prosodic dependence. This means that they cannot 
by themselves constitute a domain for word stress — they must ‘lean’ on a 
prosodic host. By contrast, canonical word-forms (often called free forms) 
exhibit prosodic independence. This can be seen in several ways. 

First, an utterance may be interrupted at a boundary between two free 
forms, but not at a boundary between a bound form and its host. This is true 
for affixes (e.g. Paul ... started to play, or Paul started ... to play, but not *Paul start 
... ed to play), and also for clitics, as shown by the Croatian sentences in (9.10).³ 


(9.10) a. Oni  =su      ... počeli     igrati. 
          they =AUX.3PL     began.M.PL play.INF 
          ‘They began to play.’ 
       b. *Oni ... =su počeli igrati. 


Croatian clitics are prosodically dependent on the preceding word. The 
prosodic dependence of the clitic makes it impossible to pause between the 
clitic and the preceding word. 

Also, clitics never bear their own stress. In the example above, su does not 
(and cannot) bear stress at all. In the French imperative joue=le! ‘play it!’, 
the weak object pronoun clitic le bears stress (joue=ˈle), but this is the stress 
of the whole prosodic word (which happens to be on the final syllable), not 
le's own stress. 

Finally, in languages that use stress to express contrast, free forms can 
exhibit contrastive stress, whereas bound forms cannot. Thus, in English we 
can have PAUL started to play, or Paul started to PLAY, or Paul STARTED to play, but 
not *Paul startED to play, or *Paul started TO play, because past-tense -ed and 
infinitival to are prosodically dependent. 

Prosodic dependence can have syntactic consequences; free and bound 
forms differ in the kinds of syntactic constructions in which they can appear. 
In languages like French, where contrast is expressed by clefting, free forms 
can be clefted, but bound forms cannot.⁴ The sentence in (9.11a) can have the 
clefted variant (9.11b), but (9.12a) cannot have the clefted variant in (9.12b). 

2 Note that a bound form in this sense is not the same as a bound stem (Section 2.2), although 
the two terms are related. A bound stem is a base that cannot stand by itself in any way: it 
is not a complete word for the purposes of the syntax nor is it a complete prosodic word. 
The term bound form, as used here, refers only to lack of prosodic independence. It is thus 
a broader term that encompasses bound stems, affixes and clitics. 

3 In this and other examples, we follow the convention of linking clitics to their hosts by an 
equal sign. 


(9.11) a. Paul commenç-ait    à  jou-er. 
          Paul begin-3SG.IMPF to play-INF 
          ‘Paul started to play.’ 
       b. C'  est Paul qui commenç-ait    à  jou-er. 
          it  is  Paul who begin-3SG.IMPF to play-INF 
          ‘It’s PAUL who started to play.’ 


(9.12) a. Il=commençait à jouer. 
          ‘He started to play.’ 


       b. *C'est il qui commençait à jouer. 
          ‘It’s HE who started to play.’ 


Example (9.12b) is impossible in part because cleft constructions involve 
prominent stress on the clefted word or phrase, and the weak subject 
pronoun il is a clitic (i.e. not prosodically independent). Therefore, in the 
clefted variant of (9.12a), French uses its independent pronoun lui ‘he’ (C'est 
lui qui commençait à jouer). 

This difference between lui (free form) and il (bound form) extends to 
other syntactic constructions as well. For example, the bound forms are 
used in normal subject + verb constructions (as in je=joue ‘I play’, tu=joues 
‘you play’, il=joue ‘he plays’), but when the pronoun is topicalized, the free 
form is used (moi, je=joue ‘as for me, I play’, not *je, je=joue).⁵ And likewise 
in coordination, the free form is used: moi et toi jouons ‘you and I play’, not 
*je et tu jouons. So, the bound form cannot be used when the pronoun is 
separated from a viable host and/or in a position that requires sentential 
stress. Cross-linguistically, free forms thus exhibit more syntactic freedom 
of movement, and movement tests like clefting and topicalization can be 
useful for distinguishing free forms from bound forms. 


9.3 Clitics versus affixes 


The contrast between free forms and bound forms is only half of the 
distinction that interests us. We also need tests that will distinguish clitics 
from affixes. 


4 In cleft constructions, a word or phrase is positioned outside of its clause for the purpose 
of creating focus. In English, it is X that Y (e.g. It is MARY that ran a marathon (not Bill).) is a 
typical cleft type. 

5 Topicalization is when a constituent is moved out of its phrase, usually to the beginning 
of the sentence, to indicate the topic about which new information will be added. 

The first thing to note is that clitics exhibit notoriously heterogeneous 
behaviour. (In fact, clitics are diverse enough that some linguists use the 
term as a kind of ‘junk’ label for anything that is not quite a word and not 
quite an affix.) This means that it is impossible to identify traits that all 
clitics will exhibit, to the exclusion of affixes. Still, even allowing that there 
are many different sorts of clitics, there are a number of properties that are 
useful for collectively distinguishing them from affixes. 

Perhaps the most salient property of clitics is that they have freedom of 
host selection - i.e. a clitic can often occur with hosts of various syntactic 
categories, and its host need not be syntactically related to it. Example 
(9.13) shows that the English clitic =’s has freedom of host selection. 


(9.13) a. The person you were talking about=’s walking over here. [preposition] 
       b. Replacing the window you broke=’s going to cost a lot of money. [verb] 
       c. That house down the street=’s going to sell quickly. [noun] 


Affixes do not have such freedom of host selection — they combine with 
stems to which they are syntactically related. 

Additionally, clitics may be less prosodically integrated with their hosts 
than are affixes. In other words, affixes are always within the domain of 
word stress, but clitics may or may not be. French joue=ˈle! ‘play it!’ is an 
example of word stress applying to the entire clitic group (the expression 
formed by one or more clitics and the host). Spanish exemplifies the opposite 
pattern, in which clitics are not prosodically integrated. In Spanish, stress 
is usually on the last or penultimate syllable of the word, and rarely on the 
antepenultimate (e.g. caminar ‘walk.INF’, camina ‘walk.PRS.3SG’, caminábamos 
‘walk.PST.1PL’), but never on the fourth syllable from the end. But this is 
possible with clitic groups, e.g. díga=me=lo ‘say it to me!’, suggesting that 
in this language, the clitics are prosodically dependent on the host, but 
outside of the domain for stress placement. In this respect, then, Spanish 
clitics behave unlike affixes. 
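
The Spanish argument amounts to counting how far the stressed syllable is from the end of the form. The following Python sketch makes that count explicit. The syllable divisions and stressed-syllable indices are supplied by hand as assumptions (following the description above: final, penultimate and antepenultimate stress for the three verb forms, and stress on the first syllable of the clitic group); the function names are hypothetical.

```python
def stress_from_end(syllables, stressed_index):
    """Return 1 for final stress, 2 for penultimate stress, and so on."""
    return len(syllables) - stressed_index

def within_word_window(syllables, stressed_index, window=3):
    """True if stress falls within the last `window` syllables of the form."""
    return stress_from_end(syllables, stressed_index) <= window

# Hand-supplied syllabification and zero-based index of the stressed syllable.
examples = {
    "caminar": (["ca", "mi", "nar"], 2),
    "camina": (["ca", "mi", "na"], 1),
    "caminábamos": (["ca", "mi", "ná", "ba", "mos"], 2),
    "diga=me=lo": (["di", "ga", "me", "lo"], 0),
}

for form, (syllables, stressed) in examples.items():
    print(form, stress_from_end(syllables, stressed),
          within_word_window(syllables, stressed))
```

Only the clitic group falls outside the three-syllable window. If the clitics are excluded from the count (leaving the two syllables of the host), the stress is penultimate and the window is respected, which is the sense in which the clitics are outside the domain for stress placement.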

Third, morphophonological rules are less likely to operate across the 
boundary between a host and a clitic than across the boundary between a 
stem and an affix. Certainly, some languages have rules that apply equally 
to affixes and clitics, including the rule of vowel harmony in Finnish. 
Bound elements like the suffix -nsa/-nsä ‘his’ and the clitic =ko/=kö (question 
marker) agree in backness with the last vowel of the stem or host (koira-nsa 
‘his dog’, ystävä-nsä ‘his friend’; koira=ko ‘dog?’, ystävä=kö ‘friend?’).⁶ 
However, many languages have morphophonological rules that operate 
within the domain of the word-form, but not within the clitic group. For 
example, in Dutch obstruents are devoiced word-finally, and no such 
devoicing occurs when a vowel-initial suffix follows the same morpheme 
(see (9.14a)). However, when a vowel-initial clitic follows it, devoicing still 
occurs, as can be seen in (9.14b). Thus, the clitic is ‘invisible’ to the rule 
of final devoicing. 

6 Orthographic <ä>, <ö>, <y>, <o> and <a> correspond to International Phonetic Alphabet 
[æ], [ø], [y], [o] and [ɑ], respectively. The first three are front vowels; the last two are back 
vowels. 


(9.14) a. verband [vərˈbant]      verband-ig [vərˈbandəx] 
          ‘bandage’               ‘bandage-like’ 
       b. ik brand [ɪk ˈbrant]    brand=ik [ˈbrantɪk] 
          ‘I burned’              ‘I burned’ 
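
One way to picture the ‘invisibility’ of the clitic is as a matter of rule ordering: devoicing applies at the edge of the word-form (stem plus suffixes), and clitics are attached only afterwards. The Python sketch below implements this under simplifying assumptions: the underlying forms are written in a broad transcription without stress marks, the table of voiced and voiceless obstruents is a partial illustration, and all names are hypothetical.

```python
# Partial, illustrative mapping from voiced obstruents to voiceless counterparts.
DEVOICE = {"b": "p", "d": "t", "v": "f", "z": "s", "ɣ": "x"}

def devoice_final(word):
    """Devoice a word-final obstruent, if there is one."""
    if word and word[-1] in DEVOICE:
        return word[:-1] + DEVOICE[word[-1]]
    return word

def pronounce(stem, suffixes=(), clitics=()):
    """Build the word-form, apply final devoicing, then attach any clitics."""
    word_form = stem + "".join(suffixes)   # suffixes are inside the word domain
    word_form = devoice_final(word_form)   # the rule sees only the word-form
    return word_form + "".join(clitics)    # clitics are added after the rule

print(pronounce("vərband"))                   # bare stem
print(pronounce("vərband", suffixes=["əx"]))  # stem plus vowel-initial suffix
print(pronounce("brand", clitics=["ɪk"]))     # stem plus vowel-initial clitic
```

In the suffixed form the stem-final d is no longer word-final, so it stays voiced; in the clitic group it is word-final at the point where the rule applies, so it surfaces as t even though a vowel follows.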


Similarly, in Ponapean there is a rule of vowel lengthening at the end of 
the word that does not apply when a suffix follows. However, when a clitic 
such as demonstrative =et follows the noun, vowel lengthening still occurs 
(Ponapean spelling marks vowel length by the letter h): 


(9.15) sahpw ‘land’    sapw-ei ‘my land’    sahpw=et ‘this land’ 
       ngihl ‘voice’   ngil-ei ‘my voice’   ngihl=et ‘this voice’ 
       pwuhs ‘novel’   pwus-ei ‘my novel’   pwuhs=et ‘this novel’ 
       (Rehg 1981: 169-70, 186) 


Fourth, affixes may trigger idiosyncratic suppletive alternations in the 
base, whereas clitics do not. For example, consider the Finnish words in 
(9.16). 


(9.16) nainen ‘woman’   naise-llinen ‘woman-like, feminine’ 
                        naise-n ‘woman's (GEN.SG)’ 
                        naise-lla ‘to the woman (ALL.SG)’ 
                        naise-nsa ‘his woman’ 
       (Kanerva 1987: 506) 


In Finnish, many nouns alternate between a stem-final sequence -nen (e.g. 
nainen) and a sequence -se (e.g. naise-). The former occurs when the word 
is unsuffixed (i.e. in the nominative singular form), and the latter occurs 
when any kind of suffix follows, inflectional or derivational. But when a 
clitic follows the noun ‘woman’, the stem nainen is used (e.g. nainen=ko? 
‘the woman?’), showing that the clitics behave differently from affixes in 
this respect, and more like word-forms. 

Likewise, affixes may undergo idiosyncratic suppletive alternations, 
whereas clitics do not. For instance, Polish has several different inflection 
classes of verbs, and the first person singular suffix is either -m or -ę, 
depending on the class (kocha-m ‘I love’, umie-m ‘I am able’, ucz-ę ‘I teach’, 
pij-ę ‘I drink’). Object pronouns, however, are clitics that attach after their 
hosts, and they have an invariable shape: go (kocham=go ‘I love him’, piję=go 
‘I drink it’, uczę=go ‘I teach him’, etc.). 

Fifth, affix-base combinations may have an idiosyncratic meaning, 
whereas clitic-host combinations never do. Idiosyncratic meanings of 
affixes are widespread in derivational morphology, but occasionally they 
are found in inflection as well, e.g. the Dutch inflected form ouder. In its 
literal meaning this word is a comparative (‘older’), but has also been 
extended to use as a noun, with the idiosyncratic meaning 'parent' (see 
(5.14) for more examples). 

Sixth, expected affix-base combinations may arbitrarily fail to exist, 
whereas clitic-host combinations are always possible. 

And finally, it is often noted that clitics may have some syntactic freedom 
of movement, whereas an affix must always attach to its base and cannot 
move independently of it. The degree to which clitics do have freedom of 
movement is a somewhat complicated issue, and it is useful here to divide 
clitics into two categories: simple and special (Zwicky 1977). 

A simple clitic is one that can appear in the same syntactic positions as 
a corresponding free form. For instance, the clitic form 's appears largely in 
the same positions as the free form is, as shown in (9.17). 


(9.17) a. Replacing the window you broke=’s going to cost a lot of money. 
       a’. Replacing the window you broke is going to cost a lot of money. 


       b. It=’s going to cost a lot of money to replace the window you broke. 
       b’. It is going to cost a lot of money to replace the window you broke. 


Of course, as we saw in the preceding section, clitics cannot be used when 
an accented form is required, or when there is no host to lean on (* =’s he 
going to replace the window?). Otherwise, however, simple clitics have the 
same freedom of movement as free forms. 

A special clitic is ‘special’ in the sense that its syntactic distribution differs 
from that of free forms and must be described in its own right. Special clitics 
typically have less freedom of movement than simple clitics, or even none. 
For instance, second-position clitics (also called Wackernagel clitics after 
the linguist who made them famous) appear after the first element of the 
(simple) sentence, which serves as the host. Depending on the language, 
the first element may be either the first stressed word, or the first syntactic 
constituent. The following examples come from Pitjantjatjara. 


(9.18) a. Tjitji-ngku =ni nya-ngu. 
child-ERG =ACC.1SG see-PST 
‘The child saw me.’ 


b. Tjitji nyanga pulka-ngku =ni nya-ngu. 
child this big-ERG =ACC.1SG see-PST 
‘This big child saw me.’ 


(Bowe 1990: 12) 


In this language, the accusative pronominal clitic ni must occur after the first
syntactic constituent, as shown in (9.18).⁷ It cannot occur in other positions,
even where a free form pronoun is possible. Compare the sentence with the
free form of the pronoun in (9.19a) to the one with the equivalent clitic in
(9.19b).


7 In the Pitjantjatjara writing system, the letters l̲ and n̲ correspond to IPA [ɭ] (retroflex
lateral) and [ɳ] (retroflex nasal), respectively. These contrast with [l] and [n], which are
written without the underline.


(9.19) a. Trevor-lu mukuri-nganyi Mary-lu ngayunya helpamila-ntjaku.
Trevor-ERG want-PRS.CONT Mary-ERG me help-PURP
‘Trevor wants Mary to help me.’


b. *Trevor-lu mukuri-nganyi Mary-lu =ni helpamila-ntjaku.
Trevor-ERG want-PRS.CONT Mary-ERG =ACC.1SG help-PURP
(Bowe 1990: 72) 


The pronominal clitic thus has no freedom of movement. 

Serbian similarly has second-position clitics. An accusative clitic pronoun 
is shown in (9.20), and the free form pronoun in (9.21). The (c) examples 
are the crucial comparison: the full form can occur in third position but the 
clitic cannot, despite having a suitable prosodic host. 


(9.20) a. Marija =ga voli.
Maria =him loves
‘Marija loves him.’


b. Voli =ga Marija.
c. *Marija voli =ga.

(9.21) a. Marija njega voli.
Maria him loves
‘Marija loves HIM.’
b. Voli njega Marija.
c. Marija voli njega.

Notably, however, Serbian clitics have limited freedom of movement under
particular circumstances.


(9.22) a. Marija želi da =joj =ga predstavi.
Maria wants that F.DAT.SG M.ACC.SG introduces
‘Marija wants to introduce him to her.’
b. (?)Marija =joj =ga želi da predstavi.
Maria F.DAT.SG M.ACC.SG wants that introduces
‘Marija wants to introduce him to her.’
(Franks and King 2000: 243) 


Here, the clitics joj ‘to her’ and ga ‘him’ are associated with the verb predstavi 
‘introduce’, and in (9.22a), they appear in second position in that verb’s 
clause (...da joj ga predstavi). In (9.22b), however, the clitics have ‘climbed’ 
into the higher clause, and appear in second position within it. Some 
speakers consider (9.22b) to be the less preferable version of the sentence, 
but many speakers accept both. Thus, when Serbian second-position clitics 
appear in a lower clause they may have some freedom of movement, and in
this respect they are unlike affixes. 

The criteria for distinguishing between affixes and clitics are summarized 
in Table 9.3. 


Clitics                                 Affixes

freedom of host selection               no freedom of stem selection
possible freedom of movement            no freedom of movement
less prosodically integrated            more prosodically integrated
may be outside the domain of a          within the domain of a
  phonological rule                       phonological rule
do not trigger/undergo                  may trigger/undergo
  morphophonological or                   morphophonological or
  suppletive alternations                 suppletive alternations

clitic-host combinations...             affix-base combinations...
  do not have idiosyncratic meanings      may have idiosyncratic meanings
  do not have arbitrary gaps              may have arbitrary gaps


Table 9.3 Clitics versus affixes 


Overall, the data present us with the picture that clitics are like affixes in some 
respects, and like independent word-forms in others. Just as importantly, 
however, clitics do not themselves constitute a uniform group. All clitics are 
prosodically dependent on a host and have some freedom of host selection, 
but some clitics are prosodically or phonologically integrated with their 
hosts while others are not; some have special syntax, but others do not; 
and so on. This heterogeneous behaviour makes sense from a diachronic 
perspective. Inflectional morphology commonly arises from free words, 
and we can hypothesize that clitics represent the intermediate stages in this 
transition. Most likely, fast speech processes lead to reduced variants of 
already prosodically weak grammatical elements. These reduced variants 
are then susceptible to being reanalyzed by a new generation of speakers 
as distinct lexical expressions, rather than as straightforward instances of 
phonetic reduction. And over time, these clitics may acquire further affixal 
properties: reduced stem selection (e.g. if a clitic attaches predominantly to 
a single word-class, it may be reanalyzed as attaching only to that word- 
class), reduced freedom of movement, morphological and phonological 
cohesion, etc. 

Cross-linguistically, then, clitics can be expected to exhibit a wide 
range of syntactic, morphological and phonological properties. Of course, 
some clitics may never become canonical affixes. The development from 
a free word form to an affix is not a single change with a predetermined 
outcome; it is better thought of as a series of small changes that collectively
(and to some extent coincidentally) reduce the prosodic, syntactic and 
morphological independence of an expression. Not surprisingly, many 
of the facts surrounding clitics, and how they should be accounted for in 
formal description, are topics of ongoing debate in morphological theory. 


9.4 Lexical integrity 


We saw in the preceding sections that words differ from syntactic phrases 
in a number of crucial respects. We failed, however, to ask why these 
differences exist. We now turn to this issue. 

Many linguists have posited that the various differences in the behaviour 
of words and phrases reflect a single general principle, which can be 
formulated as follows: 


(9.23) Lexical Integrity Hypothesis: Rules of syntax can refer/apply to 
entire words or the properties of entire words, but not to the internal 
parts of words or their properties. 


The Lexical Integrity Hypothesis (also called the Lexical Integrity Principle) 
comes in many subtly different forms, but the core idea is that as far as 
syntactic rules are concerned, words have no internal structure. They are 
atomic. Moreover, (9.23) is a claim about the nature of language in general, 
so the generalization should apply to all languages in the world. 

The validity of the generalization is subject to empirical testing. So far, 
discussion in this chapter has focused on examples that are consistent 
with (9.23). It is not, however, sufficient to show that rules of syntax (e.g. 
movement, anaphoric replacement, coordination ellipsis) often fail to apply 
to the internal parts of words. The Lexical Integrity Hypothesis states that 
syntactic rules can never refer to the internal parts of words. We must therefore
first ask whether languages ever violate the principle of lexical integrity. 

The Lexical Integrity Hypothesis can be evaluated in a meaningful way 
only if the notions ‘word’ and ‘syntactic rule’ are specified precisely. For
instance, a clitic forms a prosodic word with its host, but (9.23) clearly does 
not hold over the prosodic word. If it did, clitics could have no independent 
freedom of movement - movement is a kind of syntactic rule, and a clitic is 
an internal part of a prosodic word. However, there is a simple resolution 
to this issue. We need only assume that lexical integrity holds over the 
morphosyntactic word, rather than the prosodic word, meaning that a clitic 
and its prosodic host are separate words in the relevant sense and thus not 
subject to (9.23) (Bresnan and Mchombo 1995). 

Hungarian exemplifies a more complex problem. As shown in (9.24), 
meg-old [PFV-solve] can serve as the input to deverbal noun derivation and
deverbal adjectival formation. It is thus quite clear that constructions formed 
with meg are single words in some sense. (The alternative analysis, that -ás
and -hatatlan attach to old by itself, seems unlikely because meg appears with
nouns and adjectives only if they are deverbal.) 


(9.24) meg-old ‘solve’ 
meg-old-ás ‘solution’ 
meg-old-hatatlan ‘unsolvable’ 


Given this, it is somewhat surprising that meg can be separated from the 
rest of the verb, as shown in (9.25b). 


(9.25) a. Pál meg-old-ott-a a problémá-t.
Paul PFV-solve-PST-DEF.3SG the problem-ACC
‘Paul solved the problem.’


b. Pál nem old-ott-a meg a problémá-t.
Paul not solve-PST-DEF.3SG PFV the problem-ACC
‘Paul didn’t solve the problem.’


Is this a violation of lexical integrity? Not necessarily. Like the forms in (9.24), 
by all indications megoldotta is generated by morphological rules, but some 
formal approaches posit that these kinds of periphrastic constructions are 
then inserted into syntax at more than one node (i.e. meg and oldotta occupy 
separate syntactic nodes). Moreover, it has been suggested that the relevant 
notion of a word is ‘terminal syntactic node’, rather than ‘morphologically 
generated object’ (Ackerman and LeSourd 1997). Thus, under these 
assumptions, lexical integrity is not violated. The point here is that it can be 
quite difficult to determine what constitutes a true violation of the Lexical 
Integrity Hypothesis. 

Nonetheless, some kinds of data are widely viewed as problematic. For 
instance, all syntactic models treat agreement and movement (or some 
equivalent of movement) as types of syntactic rules, so we should not expect 
to find agreement within a word, or between an internal part of a word and 
another element in the sentence. Likewise, we should not encounter instances
of movement within a word, or syntactic placement of an element within a 
word. No conclusive examples of word-internal syntactic movement have 
been documented. (Some approaches to morphological description posit 
word-internal movement, but for theory-internal reasons, not empirical 
ones.) However, rare instances of the other three possibilities can be found. 

For example, in some languages verbs show agreement with incorporated 
nouns (i.e. with compound members), as is the case in Southern Tiwa: 


(9.26) a. ti-khwian-mu-ban
1SG.SBJ/SG.OBJ-dog-see-PST
‘I saw the dog.’⁸

b. bi-khwian-mu-ban
1SG.SBJ/PL.OBJ-dog-see-PST
‘I saw dogs.’
(Allen et al. 1990: 322)


8 Note that this example seems to be an exception to the generalization that incorporated
nouns have generic interpretation (see (9.1) and (9.2)). The reason for this is unclear.


Here the agreement prefixes register the number of the incorporated noun 
-khwian- ‘dog(s)’, so it is hard to escape the conclusion that there is agreement
between the internal parts of the complex verb. Also, in Chapter 5 we 
encountered the following Upper Sorbian example of agreement between 
an internal part of a word and another word in the phrase (repeated from 
(5.13)). Here, the adjective mojeho agrees for gender with the root of the 
denominal adjective mužowa. (Muž ‘husband’ is a masculine noun.)


(9.27) moj-eho muž-ow-a sotra
my-M.SG.GEN husband-POSS-F.SG.NOM sister.F.SG.NOM
‘my husband’s sister’

(Corbett 1987: 303) 


Equally interesting are examples of clitics within words. In Udi, the clitic 
ne sometimes occurs between the two parts of a bimorphemic verb stem 
(see (9.28a)). 


(9.28) a. nana-n buya-ne-b-e pa acik'alsey
mother-ERG find-3SG-do-AORII two toy.ABS
‘Mother found two toys.’
b. nana-n te-ne buya-b-e pa acik'alsey
mother-ERG NEG-3SG find-do-AORII two toy.ABS
‘Mother did not find two toys.’
(Harris 2002: 117, 123) 


Buya-b-e seems to be a single word: coordination ellipsis, movement and 
anaphoric replacement of the internal constituents are all impossible. Yet 
according to properties such as host selection, ne is a clitic whose position 
is governed by principles of syntax, not an affix (compare (9.28a) with 
(9.28b)). So it seems that the syntax places the clitic internally to the verb. 
Moreover, to do so, the rules of syntax must know where the morpheme 
boundaries are, and therefore must refer to the verb's internal morphemic 
structure (Harris 2000). This is problematic for the idea of lexical integrity. 

In the end, the Lexical Integrity Hypothesis remains controversial. The
most extreme syntactic approaches to morphological structure reject it
entirely, partly on empirical grounds, but also for theory-internal reasons. (A
theory that lacks any principled distinction between words and phrases
cannot in any meaningful way even define a condition like (9.23).) Still, some
linguists consider counterexamples to be minor, and many approaches to
formal description accept some version of the Lexical Integrity Hypothesis.

Finally, if we believe that the Lexical Integrity Hypothesis can be upheld on
empirical grounds, we may ask about its underlying causes. One possibility
is that (9.23) is an axiomatic principle of Universal Grammar — a fundamental 
and irreducible constraint on the nature of language. Or alternatively, 
lexical integrity effects might indirectly result from independent syntactic 
constraints on movement, binding of anaphoric pronouns, and so on. But 
most commonly, lexical integrity is seen as following from the architecture of 
grammar. If morphology and syntax are two completely separate components 
of grammar, it can be expected that syntactic rules do not ‘look inside’ the 
complete words received from the morphology. In particular, lexical integrity 
effects have been used to motivate a language architecture in which syntactic 
rules apply to bundles of morphosyntactic features, and the morphological 
component delivers complete words only after all syntactic rules have applied. 
(See Section 5.5.2 for more discussion of this proposed architecture.) This 
would explain why compound members cannot be replaced by anaphoric 
pronouns, why they cannot be extracted or undergo coordination ellipsis, etc. 
The syntax simply has no access to word-internal structure because words 
are inserted post-syntactically. 

There are a large number of ways in which syntactic rules can be 
formulated, and a large number of ways in which the relevant notion of 
‘word’ can be defined. It will therefore inevitably be difficult to reach a
consensus about the exact relation between syntax and morphology. Clearly, 
morphology and syntax are different to some degree, but the nature of the 
relationship will probably be debated for a long time to come. 


Summary of Chapter 9 


There are two main difficulties that we encounter in dividing texts 
into word-forms: distinguishing affixed word-forms from phrases that 
contain a function word, and distinguishing compounds from phrases 
with two content words. Word-forms that are intermediate between fully 
independent word-forms and fully dependent affixes are called clitics, 
and clitics and affixes are grouped together as bound forms. Free forms 
differ from bound forms in that they are prosodically independent, 
cleftable, topicalizable and coordinatable. Clitics differ from affixes in 
that they have greater freedom of host selection, are phonologically less 
integrated, do not trigger or undergo morphophonological alternations, 
show no idiosyncrasies of meaning or distribution, and may have more 
freedom of movement. Phrases differ from compounds in that they 
allow referential dependent members and exhibit less phonological, 
morphological and syntactic cohesion. 

Often a ‘Lexical Integrity Hypothesis’ is postulated that forbids 
syntactic rules to apply to parts of words. While (apparent) 
counterexamples to lexical integrity can be found, linguists have
interpreted these exceptions differently. Among those formal
approaches that accept the Lexical Integrity Hypothesis, the effects 
are usually seen as falling out from the architecture of the grammatical 
system — separate morphological and syntactic components. 


Further reading 


The literature on clitics is strikingly large. First and foremost, much 
influential work regarding the affix/clitic distinction is due to Arnold Zwicky
(e.g. Zwicky 1977, 1985b and Zwicky and Pullum 1983). See also Klavans
(1985), Kanerva (1987), Halpern (1995), Bošković (2001), Aikhenvald (2002)
and Anderson (2005). Some famous examples of special clitics come from 
the Slavic languages; see Franks and King (2000) for descriptions. Nevis 
et al. (1994) is a bibliography of clitic research prior to 1991. More recent 
collections of papers include Beukema and den Dikken (2000) and Gerlach 
and Grijzenhout (2000). 

For compounds versus phrases, see Bauer (1998), Smirniotopoulos and 
Joseph (1998) and Bisetto and Scalise (1999). 

Lexical integrity is discussed and defended in Wasow (1977), Lapointe 
(1980), Di Sciullo and Williams (1987), Bresnan and Mchombo (1995), and 
in a modified form, in Ackerman and LeSourd (1997). Also see Rosen (1989) 
and Mohanan (1995) for arguments specifically related to incorporation. 
For counterarguments to the Lexical Integrity Principle, see Lieber (1992), 
Harris (2000), and Booij (2009). Syntactic approaches that do not assume 
lexical integrity include Baker (1988), Sadock (1991) and Halle and Marantz 
(1993). 


Comprehension exercises 


1. At the beginning of Section 9.1, we asked whether backboard, backdoor 
and back seat are compounds or phrases. Develop an answer to this 
question, and justify it using tests introduced in this chapter. 


2. Provide arguments to show that English -s, the suffix of the third person 
singular of present-tense verbs, is an affix, not a clitic. 


3. What is wrong with the following sentences?


a. Polish
*Go spotka-ł-em w Krakowie.
him meet-PST-1SG in Cracow
‘I met him in Cracow.’

b. French
*A: Qui joue? Robert? - B: Non, Tu-joues.
who plays Robert no you-play
‘A: Who is playing? Robert? - B: No, YOU are playing.’

c. Serbian
*Klaru čovek voleo-je.
Klara.ACC man.NOM loved-AUX
‘The man has loved Klara.’

d. Ponapean
*I keng-wini-o-la.
1SG eat-medicine-DEM-COMPL
‘I completed my taking of that medicine.’


4. Sometimes the various criteria for distinguishing clitics from affixes 
contradict each other. For instance, in Spanish the bound pronominals 
undergo a morphophonological alternation when a third-person dative 
pronominal co-occurs with an accusative pronominal: -le is replaced by 
-se because another l follows:


díga-me ‘tell me’
díga-le ‘tell him’
díga-me-lo ‘tell me it’
díga-se-lo ‘tell him it’ (*díga-le-lo)

Given what we said in this chapter about Spanish bound pronominals, 
where is the contradiction? 


5. Another case of a contradiction comes from Lithuanian, which forms 
reflexive verbs by means of an element s(i). (The letter ė stands for a
long [e:].)


      ‘rock’    ‘rock oneself’   ‘not rock oneself’
1SG   supu      supuosi          nesisupu
2SG   supi      supiesi          nesisupi
3     supa      supasi           nesisupa
1PL   supame    supamės          nesisupame
2PL   supate    supatės          nesisupate

In what ways is this element like an affix, and in what way is it like a 
clitic? 


6. Look at the example of noun incorporation in Guaraní (ex. (11.26)). 
Which criteria can be applied to show that (11.26b) contains a compound, 
not a phrase like (11.26a)? 
Exploratory exercise


As we saw in this chapter, some linguists have posited that a word boundary 
operates as a kind of border below which syntactic rules cannot apply. 
However, it is not always clear what constitutes evidence for or against this 
hypothesis. For instance, in Germanic languages such as English, German 
and Dutch, entire phrases can act as compound members, e.g. a down in the 
trenches attitude, a floor of the birdcage taste, or just rolled out of bed hair. We will 
call these phrasal compounds. The question we want to ask is: Do phrasal 
compounds violate lexical integrity? 

Some researchers have suggested that the dependent member of the 
compound (i.e. the phrase) must be a set phrase or idiom — something 
that is stored in the lexicon as a single unit. If so, phrasal compounds are 
probably not true violations of lexical integrity. If the embedded phrase 
is stored in the lexicon similarly to a word, this may make it available to 
the word-formation component without the operation of syntactic rules. 
In this exercise you will test the claim that the dependent member of such 
compounds must be a set phrase or idiom, and explore implications that 
the results have for the Lexical Integrity Hypothesis. 

Many languages do not allow phrasal compounds, so the language of 
study should be one for which you already know that these compounds are 
possible to some extent. English is used here for demonstration purposes. 
The methodology involves observation of naturally-occurring examples. 


Instructions 


Step 1: Develop criteria for determining that a construction is (or 
is not) a phrasal compound. For instance, down in the trenches attitude is 
clearly a phrasal compound because ‘normal’ compounds cannot include 
prepositions or determiners. Moreover, the structure must be [[down in the 
trenches] attitude]. But what about gold and jewellery merchants or severe weather 
warning? Should the first be considered a phrasal compound (presumably 
with the structure [[gold and jewellery] merchants]), or conjoined phrases with 
ellipsis (derived from [[gold merchants] and [jewellery merchants]])? Is severe 
weather a phrase, or itself a compound inside the larger compound? 

Consider many kinds of borderline cases and develop specific criteria 
for including or excluding them from the category of phrasal compounds. 
Explain your reasoning. You might find it helpful to review the rules of 
compounding for your chosen language of study. Useful descriptions of 
English compounding patterns include Marchand (1969), Bauer (1983), 
Bauer and Renouf (2001) and Plag (2003). 

Step 2: Choose a newspaper or magazine to gather data from. Scan the 
text for phrasal compounds, and record any that are found. Continue until 
you have recorded at least 10 examples, preferably more. If working with 
others (e.g. in a class setting), a divide-and-conquer approach might be 
useful.⁹ Have each person gather 10 examples from different sources and
then combine the results. 

Step 3: Analyze the data. Remember that the starting point for this project 
is the claim that in phrasal compounds, the phrase must be likely to be 
directly stored in the lexicon rather than productively generated. Decide 
whether the dependent members of your phrasal compounds fit this 
description. Are they freely formed syntactic constructions, collocations, 
set phrases, idioms ...? 

(A collocation consists of two or more words that tend to occur together. 
For instance, weather tends to occur together with severe, but less often with 
harsh. Thus, severe weather is a collocation, but harsh weather is not. Linguists 
often measure the collocation strength of two words as the likelihood of 
two words occurring together, compared with the likelihood of each 
word occurring independently. Collocation strength can thus formally be 
expressed as a probability. However, for the purposes of this exercise, it 
is sufficient to use your own intuitions about the degree to which some 
construction constitutes a collocation or set phrase in the language.) 

Step 4: Consider the implications of your data for the Lexical Integrity 
Hypothesis. Are your data consistent with this hypothesis, or do they seem 
to contradict it? Explain your reasoning. 

Step 5 (optional): Read some of the literature about phrasal compounds 
and lexical integrity. Work in this area includes Botha (1981), Lieber (1992), 
Wiese (1996), Ackema and Neeleman (2004), Carstairs-McCarthy (2005) and 
Lieber and Scalise (2007). Do your observations match the description of 
phrasal compounds in the literature? Based on what you have read, why 
is it difficult to determine whether phrasal compounds represent true 
violations of lexical integrity? Does discussion in the literature make you 
look at your data in a new light? If so, explain. 
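
For readers who want to go beyond intuition, the collocation strength described under Step 3 can be approximated from corpus counts. The sketch below compares the observed probability of a word pair with the probability expected if the two words occurred independently (the logarithm of this ratio is commonly known as pointwise mutual information). All counts are invented for illustration, and the function name is hypothetical; this is one possible measure, not the only one in use.

```python
def collocation_strength(pair_count, count1, count2, total):
    """Ratio of the observed pair probability to the probability
    expected if the two words were independent.

    Values well above 1 suggest that the words attract each other.
    """
    p_pair = pair_count / total
    p_word1 = count1 / total
    p_word2 = count2 / total
    return p_pair / (p_word1 * p_word2)


# Invented counts for a hypothetical corpus of one million word pairs.
TOTAL = 1_000_000
severe_weather = collocation_strength(50, 200, 1000, TOTAL)
harsh_weather = collocation_strength(2, 200, 1000, TOTAL)

print(f"severe weather: {severe_weather:.1f}")
print(f"harsh weather:  {harsh_weather:.1f}")
```

With these made-up numbers, severe weather scores far higher than harsh weather, which matches the intuition described in Step 3. Real corpus counts would of course be needed to draw any conclusion about English.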


9 The attentive reader has noticed that [[divide-and-conquer] approach] is a phrasal compound!


Morphophonology 


In Chapter 2, we saw that morphemes often have different phonological
shapes depending on the environment, i.e. the other morphemes and
sounds with which they co-occur in a word. For example, the stem of the
English lexeme leaf is pronounced [li:f] in the singular, but [li:v] in the plural
(leaves); the stem of pat is always pronounced [pæt] if it occurs without any
suffix, but in many varieties the pronunciation is [pæɾ] if a vowel-initial suffix
follows (patting [pæɾɪŋ]). The forms [pæt] and [pæɾ] (and [li:f] and [li:v]) are
phonological allomorphs — they bear the same meaning and have quite 
similar phonological shape (in contrast to suppletive allomorphs, which 
are not phonologically similar). Phonological allomorphs are interesting 
because they represent the point of intersection between morphological 
and phonological structure. In this chapter we explore issues related to the 
morphology-phonology interface in some detail. 


10.1 Two types of alternations 


The formal relation between phonological allomorphs is called an 
alternation. Alternations come in two kinds: automatic alternations 
and morphophonological alternations. Like morphological patterns, 
alternations are often described in process terms. For example, German 
has a phonemic distinction between /k/ and /g/, and in positions before 
a vowel, both consonants are possible (e.g. Kinder [k] ‘children’, Geld [g] 
‘money’, Völker [k] ‘peoples’, Tage [g] ‘days’). But in syllable-final position,
both /k/ and /g/ are pronounced as [k] (e.g. Volk [fɔlk] ‘people’, Tag
[ta:k] ‘day’). This alternation of voiced and voiceless obstruents is thus 
called Final Devoicing (a process term) because /g/ seems to 'lose' its 
voicing feature in syllable-final position. As always when process terms 
are used, this terminology is probably best understood as metaphorical — 
speakers' own knowledge may not include a literal process of devoicing. 
But the process terminology is very convenient because it gives more
information than purely static terminology. If we simply referred to the
German alternation in [ta:k]/[ta:gə] as voiceless/voiced alternation, we would
not know that voiceless obstruents in syllable-final position do not always
participate in this alternation (e.g. [k] in both Volk and, crucially, Völker). If,
on the other hand, we call the alternation devoicing, it is immediately clear 
that the existence of [k] in both syllable-initial and syllable-final positions is 
completely expected, but [g] in syllable-final position is not. 
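
The asymmetry that the process term captures can be made concrete with a small sketch. It treats Final Devoicing as a one-way mapping applied to pre-syllabified underlying forms. The transcriptions are rough, the obstruent table is deliberately incomplete, and only the last segment of each syllable is inspected; these are simplifying assumptions for illustration, not a full phonological analysis of German.

```python
# One-way mapping: voiced obstruents have voiceless counterparts,
# but voiceless obstruents are never changed.
DEVOICED = {"b": "p", "d": "t", "g": "k", "z": "s", "v": "f"}


def final_devoicing(syllables):
    """Devoice a voiced obstruent at the end of each syllable.

    `syllables` is a list of syllable strings (an assumed representation).
    """
    result = []
    for syllable in syllables:
        if syllable and syllable[-1] in DEVOICED:
            syllable = syllable[:-1] + DEVOICED[syllable[-1]]
        result.append(syllable)
    return result


print(final_devoicing(["ta:g"]))          # Tag: /g/ is syllable-final
print(final_devoicing(["ta:", "gə"]))     # Tage: /g/ is syllable-initial
print(final_devoicing(["fɔlk"]))          # Volk: /k/ stays voiceless
print(final_devoicing(["fœl", "kər"]))    # Völker: /k/ stays voiceless
```

Because the mapping runs in one direction only, an underlying /k/ surfaces as [k] in every position, while an underlying /g/ surfaces as [g] only when it is not syllable-final. A symmetric label such as voiceless/voiced alternation would not encode this.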

Let us look at a few representative cases of both automatic and 
morphophonological alternations, focusing on examples of stem allomorphy. 


(10.1) Some automatic alternations¹
a. German Final Devoicing: Voiced obstruents are pronounced
voiceless when they occur in syllable-final position.


Tage [ta:gə] ‘days’ Tag [ta:k] ‘day’
Liese [li:zə] ‘Liese (name)’ Lieschen [li:sçən] ‘little Liese’
Monde [mo:ndə] ‘moons’ Mond [mo:nt] ‘moon’


b. English Flapping: In many varieties of English, alveolar plosives
([d] and [t]) are pronounced as voiced flaps [ɾ] when they occur
after a vowel and in front of an unstressed vowel.


pat [pæt] patting [pæɾɪŋ]
fat [fæt] fatter [fæɾər]
pad [pæd] padding [pæɾɪŋ]


c. Russian Akanie (neutralization of unstressed o and a): The vowel
o is pronounced [a] when it occurs in the syllable immediately
before the stressed syllable, and both o and a are pronounced [ə]
when they occur in an earlier syllable, or in a syllable after the
stressed syllable.

vol [vol] ‘ox (NOM.SG)’ vol-y [va'li] ‘oxen (NOM.PL)’
nós-it ['nosʲit] ‘carries’ nos-i [na'sʲi] ‘carry!’ (IMPV)
bórod-y ['borədi] ‘beards’ borod-á [bəra'da] ‘beard’
bandit [ban'dʲit] ‘gangster’ bandit-izm [bəndʲi'tʲizm] ‘gangsterism’


d. Japanese Palatalization: Alveolar obstruents ([t] and [s]) are
pronounced as palatals ([tɕ] and [ɕ], commonly written as ch and
sh) when they occur before the high palatal vowel [i].
kas-e ‘lend’ (imperative) kash-i ‘lend’ (continuative) 
kat-e ‘win’ (imperative) kach-i ‘win’ (continuative) 

(Vance 1987: 177) 


1 The names of alternations are capitalized here because they are often traditional names that 
are in general use among linguists (e.g. German Umlaut, Russian Akanie, Japanese Rendaku). 
(10.2) Some morphophonological alternations

a. English Trisyllabic Shortening: The vowels or diphthongs [ei],
[i:], [ai] and [ou] alternate with the short vowels [æ], [e], [ɪ] and
[ɔ] when followed by two syllables, the first of which is unstressed.

[ei] nation [æ] national
[i:] extreme [e] extremity
[ai] divine [ɪ] divinity
[ou] globe [ɔ] globular


b. German Umlaut (vowel fronting): The back vowels and
diphthongs a, o, u and au alternate with front vowels ä [e, ɛ], ö
[ø, œ], ü [y, ʏ] and äu [ɔʏ] in certain morphological environments
(plural of nouns, past subjunctive of verbs, female-noun suffix
-in).

Buch ‘book’ Bücher ‘books’
Vater ‘father’ Väter ‘fathers’ (cf. (3.1))
bot ‘offered’ böte ‘would offer’
Jude ‘Jewish person/man’ Jüdin ‘Jewish woman’


c. Russian Zero Alternation: The vowels o/e in the last syllable of the
stem sometimes alternate with zero when a vowel-initial suffix
follows.

zámok ‘castle (NOM)’ zámk-i ‘castles (NOM)’ (cf. 2.12b)
ókon ‘windows (GEN)’ okn-ó ‘window (NOM)’
zemél'-nyj ‘relating to land’ zemlj-á ‘land’
ogrómen ‘huge (predicative)’ ogrómn-yj ‘huge (attributive)’


d. Hebrew Spirantization (fricativization): The stops p, b, k alternate
with the fricatives f, v, x when a vowel precedes.

yi-spor ‘he will count’ sofer ‘he counts’
kotev ‘he writes’ yi-xtov ‘he will write’
pilpel ‘he peppered’ me-falpel ‘he peppers’
bakasa ‘request’ be-vakasa ‘please’


e. Turkish k/ğ alternation: The consonant k alternates with ğ when
a vowel follows. (In standard Turkish, the letter ğ is no longer
pronounced, so yatağ-ı is [jataɯ], but some non-standard varieties
preserve a velar fricative.)

inek ‘cow’ ineğ-i ‘his cow’
kuyruk ‘tail’ kuyruğ-u ‘its tail’
köpük ‘foam’ köpüğ-ü ‘its foam’
yatak ‘bed’ yatağ-ı ‘its bed’


f. Japanese Rendaku (sequential voicing): Morpheme-initial
voiceless obstruents alternate with voiced obstruents when a
vowel precedes, mostly when they occur initially in a second
compound member. (Note that [b] functions as the voiced
equivalent of [h].)

kami ‘paper’ (iro ‘color’) iro-gami ‘colored paper’
tooroo ‘lantern’ (ishi ‘stone’) ishi-dooroo ‘stone lantern’
shirushi ‘mark’ (hoshi ‘star’) hoshi-jirushi ‘asterisk’
hone ‘bone’ (se ‘back’) se-bone ‘backbone’
chi ‘blood’ (hana ‘nose’) hana-ji ‘nosebleed’


(Vance 1987: ch. 10) 


So what makes automatic alternations different from morphophonological 
alternations? In other words, why do linguists recognize two distinct types? 
The following empirical characteristics can be used to distinguish between 
the two. 

(i) Phonological versus morphological/lexical conditioning. In automatic 
alternations, the conditions under which the alternations occur can always 
be described in purely phonological terms. In morphophonological 
alternations, by contrast, the conditions always have a morphological 
(and sometimes also lexical) component. For example, English Trisyllabic 
Shortening is restricted to certain suffixes (e.g. globular versus globalize) and 
to certain words (e.g. national exhibits Trisyllabic Shortening, but notional 
does not; the latter is pronounced [noʊʃnl], not [nɔʃnl]). Also, Hebrew
has many words where k does not undergo Spirantization although the 
phonological condition, a preceding vowel, is met (e.g. kocer ‘reaps’, yikcor 
‘will reap’). 

In the extreme case, a morphophonological alternation occurs under 
purely morphological and lexical conditions. This can be illustrated by 
German Umlaut. This alternation was originally motivated by assimilation 
to a high front vowel in the following syllable (e.g. Jude/Jüdin), but in most 
contemporary words, this original front vowel has been lost completely or 
reduced to schwa, as shown in the examples in (10.3). 


(10.3)  Old High German    Modern German
        apful/epfili       Apfel/Äpfel [ɛpfl] ‘apple(s)’
                           (complete loss of final [i])
        kalb/kelbir        Kalb/Kälber [kɛlbər] ‘calf/calves’
                           (reduction to schwa)


Thus, nowadays this original phonological condition is irrelevant. Umlaut 
occurs together with certain suffixes such as plural -er (e.g. Buch/Bücher 
‘book(s)’). With other suffixes, it occurs only subject to further lexical 
conditions. With the plural suffix -e, the application of Umlaut has to be 
learned individually for each lexeme (e.g. Hund/Hunde ‘dog(s)’ versus 
Bund/Bünde ‘league(s)’). 

Automatic alternations are the synchronic consequence of phonetically 
motivated diachronic sound changes. Sound change is motivated by 
phonetics in the sense that it occurs because phonetic production is made 
easier by the change. For example, pronouncing an alveolar or velar 
consonant before [i] is relatively more difficult than pronouncing a palatal 
(or palatalized) consonant, and this explains why the diachronic change of 
palatalization before front vowels is so common in the world's languages 
(e.g. (10.1d)). Final devoicing helps pronunciation because maintaining the 
vibration of the vocal cords (which is made difficult by the oral obstruction of 
obstruents anyway) is particularly difficult in the final position (e.g. (10.1a)). 
Neutralization of unstressed vowels occurs for perceptual reasons: when a 
vowel is not stressed, it is less loud and thus differences between vowels are 
harder to perceive (e.g. (10.1c)). As in German Umlaut, morphophonological 
alternations often result when the phonetic motivation for some 
automatic alternation is subsequently obscured. 

(ii) Phonetic coherence. Often a whole range of different sounds is 
affected in a similar way by a sound change, so automatic alternations are 
more likely to be phonetically coherent in the sense that both the affected 
sounds and their replacements are natural classes. For example, in Old 
High German (c. 800-1100 CE), voiced obstruents did occur in syllable-final 
position, but around 1100 a sound change occurred by which all syllable- 
final obstruents became voiceless. As a result, in German, the synchronic 
rule of Final Devoicing affects all voiced obstruents and turns them into the 
corresponding voiceless obstruents. Similarly, English Flapping affects all 
alveolar plosives. 

In morphophonological alternations, by contrast, the coherence of the set 
of affected sounds may have been lost by subsequent changes. Thus, the 
class of English vowels affected by Trisyllabic Shortening is not a natural 
class; the class of Hebrew consonants affected by Spirantization is not a 
natural class; and the vowels resulting from German Umlaut are not a 
natural class (in particular, äu [ɔy], the umlauted counterpart of au [au], can 
be described as 'fronted' only with great difficulty). 

(iii) Phonetic distance. Moreover, in automatic alternations, the alternating 
sounds tend to differ in one feature only, but in morphophonological 
alternations they may differ quite drastically. For instance, English [i:]/[e], 
[oʊ]/[ɒ], Turkish [k]/∅ and Japanese h/b show a wide phonetic distance 
because the sound changes that originally created the alternations occurred 
a long time ago and subsequent changes have made the connections 
opaque. (For instance, Japanese Rendaku originally led to p/b alternations, 
comparable to k/g and t/d alternations, but later p became h.) 

(iv) Application in derived environments only. Automatic alternations 
result from constraints on pronunciation that are valid for all environments, 
and an alternation is just a special case that arises when different 
morphological contexts provide different phonological conditions. For 
instance, syllable-final obstruents are always voiceless in German, Russian 
[o] can never occur in an unstressed syllable and Japanese never allows [t] 
and [s] in front of [i] (Vance 1987: 21). Morphophonological alternations, by 
contrast, may be restricted to derived environments. For instance, Turkish 
[k] is deleted between vowels in a derived environment (see (10.2d)), but 
inside a morpheme there is nothing wrong with intervocalic [k] (e.g. sokak 
‘street’, sitreptokok ‘streptococcus’). In Hebrew, [b] is spirantized to [v] 
after a vowel, but inside a morpheme there is no problem with [b] (e.g. 
kibuc ‘gathering; kibbutz’). English long vowels and diphthongs may 
get shortened when a two-syllable suffix follows (e.g. divine/divinity), 
but inside a morpheme there is nothing wrong with a diphthong in the 
antepenultimate syllable (e.g. vitamin [ˈvaɪtəmɪn]). 

(v) Application to loanwords. Automatic processes generally apply to 
loanwords and foreign names as they do to native words. Thus, the city 
Madrid is pronounced with a final [t] in German because of final devoicing; 
in Russian, not only Moskvá is pronounced with [a] where the spelling has 
o, but also Mombasa and Montana (and in Mogadíšo, the pronunciation is 
[ə], because o is not immediately before the stressed syllable). In Japanese, 
loans from English have chi and shi for English [ti] and [si] (e.g. shiisoo from 
seesaw, shiizun from season). By contrast, the effects of morphophonological 
alternations need not be found in loanwords. Thus, Turkish loanwords 
sometimes preserve their final [k] (e.g. sitreptokok ‘streptococcus’, 
sitreptokoku), and Russian zero alternation is never applied in loanwords 
(e.g. baron/barony ‘baron’, not *barny). 

(vi) Speech style and obligatoriness. Automatic alternations may 
still be optional and sensitive to the speech style. Often, for instance, in 
formal, slow speech the process is less likely to occur than in informal, 
fast speech. For instance, English Flapping may be suppressed in formal 
speech. Morphophonological alternations are never sensitive to the speech 
style. It should be noted, however, that most automatic alternations that are 
described in grammars are obligatory as well. Thus, obligatoriness is not 
a good diagnostic of a morphophonological alternation, but sensitivity to 
speech style or optionality indicates an automatic process. 

(vii) New segments. Automatic alternations sometimes create segments 
that are not found under other conditions. For instance, English [ɾ] only 
occurs under the conditions of Flapping, and Russian [ə] occurs only under 
the conditions of Akanie. By contrast, morphophonological alternations 
tend to lead to segments that occur independently in the language. Thus, 
German has front vowels like ö and ü in basic words that have nothing to do 
with Umlaut (e.g. öde ‘bleak’, Mühle ‘mill’), and Hebrew has the fricatives 
f, v and x in basic words that have nothing to do with Spirantization (e.g. 
finjan ‘coffee cup’, ʕaxšav ‘now’). (This property of morphophonological 
alternations is also called structure preservation.) 

(viii) Application across word boundaries. Automatic alternations may 
apply across word boundaries. Thus, Flapping occurs in English also within 
phrases, as in a lot of stuff [ə lɒɾ əv stʌf]. This is not generally possible with 
morphophonological alternations. 




These differences are summarized in Table 10.1. 


Automatic alternations                   Morphophonological alternations

only phonologically conditioned          at least partly morphologically or
                                         lexically conditioned
phonetically coherent                    not necessarily phonetically coherent
alternants are phonetically close        alternants may be phonetically distant
not contradicted by simple morphemes     may be restricted to derived
                                         environments
extend to loanwords                      need not extend to loanwords
may be optional and sensitive to         not sensitive to speech style
speech style
can create new segments                  do not generally lead to new segments
not necessarily restricted to the        generally restricted to the word level
word level


Table 10.1 Two types of alternations 


These empirical criteria divide alternations into two types. The 
significance of this distinction for the formal architecture of the language 
system is addressed in Sections 10.4 and 10.5. But first we consider more 
properties of morphophonological alternations. 


10.2 The productivity of morphophonological alternations 


When we look at morphophonological alternations in greater detail, we see 
that these show quite a bit of internal diversity. For the sake of convenience, 
we can distinguish three different classes, although they are probably just 
three points on a continuum: relic alternations, common alternations and 
productive alternations. 

Relic alternations are found only in a few words, and it is therefore 
doubtful whether a rule should be formulated for them. An example is the 
s/r alternation in German. This was quite regular in Old High German: in 
vowel-changing verbs, the past-tense plural forms and the past participle 
showed r, whereas the other forms showed s: 


(10.4)  PRESENT    PAST TENSE   PAST TENSE   PAST
        TENSE      SINGULAR     PLURAL       PARTICIPLE
        lesan      las          larum        gileran      ‘read’
        ginesan    ginas        ginarum      gineran      ‘be saved’
        kiusan     kos          kurum        gikoran      ‘choose’
        friusan    fros         frurum       gifroran     ‘freeze’


In Modern German, most of these alternations have been levelled: the 
modern forms are lesen/las/lasen/gelesen, genesen/genas/genasen/genesen and 
frieren/fror/froren/gefroren. However, in the high-frequency lexeme BE, the 
alternation was partially preserved (war/gewesen). And, when we take 
derived lexemes into account, we also see it in Frost/frieren ‘frost/freeze’. In 
these cases it really takes a historical linguist to discover anything systematic 
about these alternations. For contemporary speakers, the relation between 
war ‘was’ and gewesen ‘been’ is probably as suppletive and non-systematic 
as the relation between bin ‘am’ and war ‘was’. 

Common alternations are found in many words in a language, and often 
in different morphological contexts. An example is the Diphthongization 
alternation in Spanish, whereby ue and ie occur in stressed syllables, and o 
and e occur in unstressed syllables: 


(10.5)  ciérro ‘I close’     cerrár ‘to close’
        cuénto ‘I tell’      contár ‘to tell’
        buéno ‘good’         bondád ‘goodness’
        cuérpo ‘body’        corpóreo ‘bodily’


Spanish has dozens of verbs such as cerrar and contar that show this 
alternation, and there are many derivational relationships such as bueno/ 
bondad where it shows up as well. So at least as linguists we want to formulate 
a rule rather than just say that all these cases show (weak) suppletion. And 
it would seem reasonable to assume that speakers, too, have some kind 
of rule. However, this is difficult to show, because the Diphthongization 
alternation is not productive. When a stem with a diphthong becomes the 
stem of a novel verb (e.g. a verb formed by the denominal pattern des-N- 
ar ‘remove N’), the diphthong appears throughout the paradigm (as in 
deshuesár ‘remove bones’ from huéso ‘bone’, not *deshosár). When a stem 
with a monophthong appears in a novel verb, it shows no alternation (e.g. 
filosofár ‘philosophize’, which has stem-stressed forms such as filosófo ‘I 
philosophize’, not *filosuéfo). Similarly, when a diminutive in -ito is formed 
from a noun with a diphthong, the diphthong is preserved (e.g. cuerpíto 
‘little body’, not *corpíto, from cuérpo ‘body’). 
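
The contrast between listed and novel stems can be made concrete with a small sketch. It assumes, purely for illustration, that alternating stems are stored one by one in a lookup table (the names ALTERNATING_STEMS, infinitive and first_singular are hypothetical); forms are given in ordinary spelling, without the stress marks used in (10.5). A stem that is not listed simply keeps its shape.

```python
# Illustrative sketch only: the alternation is treated as a lexical listing,
# not as a general rule. Keys are the stems used in unstressed position,
# values the stems used when the stress falls on the stem.
ALTERNATING_STEMS = {
    "cerr": "cierr",  # cerrar / cierro
    "cont": "cuent",  # contar / cuento
}


def infinitive(stem: str) -> str:
    """Suffix-stressed form: the stem appears in its unstressed shape."""
    return stem + "ar"


def first_singular(stem: str) -> str:
    """Stem-stressed form: only listed stems switch to the diphthong."""
    return ALTERNATING_STEMS.get(stem, stem) + "o"


if __name__ == "__main__":
    # "deshues" and "filosof" are novel verb stems and are not listed,
    # so they show no alternation.
    for stem in ["cerr", "cont", "deshues", "filosof"]:
        print(infinitive(stem), first_singular(stem))
```

Because the table has to be extended by hand for every alternating lexeme, the sketch mirrors the observation that the pattern is common but not productive.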

Productive alternations are not merely found in many words, but are 
also extended to new words such as neologisms and borrowings. German 
Umlaut is a famous example of such an alternation. In older German, it was 
productively extended to new plurals such as Mutter/Mütter ‘mother(s)’, 
Garten/Gärten ‘garden(s)’, which did not have Umlaut in the plural originally 
because their old plural suffix (now lost) did not contain an [i]. However, in 
modern German the Umlaut is no longer productive in plurals, and neither 
is it in female-noun formations of the type Jude/Jüdin (a newly formed female 
noun from Luchs ‘lynx’ would have to be Luchsin, not *Lüchsin). But there is 
one pattern in which the Umlaut is required: diminutives in -chen and -lein. 
For instance, one could form a diminutive Fäxchen from the new word Fax 
‘fax’, and parents might refer to a medicine called Vitamnol as Vitamnölchen 
when talking to a small child. German Umlaut thus demonstrates clearly 
that a morphophonological alternation may be productive. 

Some other productive morphophonological alternations are: 

(i) Turkish k/ğ. This is extended to some loanwords, e.g. kartotek/kartoteği 
‘card catalog’, frikik/frikiği ‘free kick’, barok/baroğu ‘baroque’. However, other 
loanwords preserve k (see the example sitreptokok/sitreptokoku in Section 
10.1). 

(ii) Polish Second Palatalization. This process changes the velars k, g and 
ch [x] to c [ts], dz [dz] and sz [ʃ] in certain environments, e.g. in the locative 
singular of nouns of the a-declension (e.g. mucha ‘fly’, locative musze; sługa 
‘servant’, locative słudze; matka ‘mother’, locative matce). This alternation is 
completely productive, and it always applies to loanwords - e.g. Braga (city 
in Portugal), locative Bradze; alpaka ‘alpaca’, locative alpace, and so on. 

(iii) Indonesian Nasal Substitution. In this alternation, the initial voiceless 
stop of a verb root is replaced by a nasal stop at the same place of articulation 
when the active-voice prefix meng- is attached to the root. In addition to t, k 
and p, this alternation also affects s, where the replacing nasal is ny [ɲ]. (The 
letters ng stand for [ŋ].) 


(10.6)  meng + urus     mengurus    ‘take care’
        meng + tulis    menulis     ‘write’
        meng + kirim    mengirim    ‘send’
        meng + pakai    memakai     ‘use’
        meng + sewa     menyewa     ‘rent’


That this alternation is productive can again be seen in the behaviour of 
loanwords, which are also subject to Nasal Substitution: 


(10.7)  meng + kritik        mengritik       ‘criticize’
        meng + sukses+kan    menyukseskan    ‘make successful’
        meng + protes        memrotes        ‘protest’


(Sneddon 1996: 9-13) 


However, in recent borrowings the initial consonant tends to be retained, 
and, besides the forms in (10.7), the forms mengkritik, mensukseskan and 
memprotes are possible as well. This perhaps indicates that the alternation is 
losing its productivity. 
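
The substituting pattern in (10.6) and (10.7) can be sketched as a small function. This is a minimal illustration that covers only what is stated above: roots beginning with p, t, k or s, and vowel-initial roots such as urus. The names SUBSTITUTION and meng_form are hypothetical, and roots beginning with any other consonant are deliberately rejected because their behaviour is not described here.

```python
# Root-initial voiceless obstruent -> nasal at the same place of articulation.
# "ng" and "ny" are digraphs for single nasal consonants.
SUBSTITUTION = {"p": "m", "t": "n", "k": "ng", "s": "ny"}
VOWELS = set("aeiou")


def meng_form(root: str) -> str:
    """Attach the active-voice prefix meng- with Nasal Substitution."""
    first = root[0]
    if first in SUBSTITUTION:
        # The nasal takes the place of the root-initial consonant.
        return "me" + SUBSTITUTION[first] + root[1:]
    if first in VOWELS:
        # Vowel-initial roots keep the full prefix, as in mengurus.
        return "meng" + root
    raise ValueError(f"root-initial {first!r} is not covered by this sketch")


if __name__ == "__main__":
    for root in ["urus", "tulis", "kirim", "pakai", "sewa",
                 "kritik", "sukseskan", "protes"]:
        print(root, "->", meng_form(root))
```

The retaining variants (mengkritik, mensukseskan, memprotes) are not modelled; they would require keeping the root-initial consonant after the prefix nasal.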

These properties of relic alternations, common alternations and 
productive alternations are summarized in Table 10.2. 


Relic alternations         Common alternations        Productive alternations

apply to very few items    apply to many items        apply to many items
do not apply to novel      do not apply to novel      apply to novel words
words                      words
probably not recognized    probably recognized by     clearly recognized by
by speakers                speakers                   speakers


Table 10.2 Three types of morphophonological alternations 


The different degrees of productivity exhibited by morphophonological 
alternations make them quite different from automatic alternations, which 
always apply to new words. 


10.3 The diachrony of morphophonological alternations 


We have seen that synchronic alternations have their origin in historical 
sound changes, but we have not yet explained why these sometimes result in 
automatic alternations and sometimes in morphophonological alternations. 
On one level, the answer is straightforward: sound changes always yield 
automatic alternations initially, and automatic alternations then become 
morphophonological alternations in a further step of development: 


(10.8) sound change → automatic alternation → morphophonological 
alternation 


Automatic alternations arise because phonetics and phonology are to 
a large extent autonomous from morphology, or, to put it in even more 
metaphorical terms, they act blindly, without seeing the consequences of 
their actions for morphology. If sound changes could, so to speak, predict 
the outcome of their actions and cared about morphology, they might 
exercise some restraint. For example, Hebrew Spirantization (see (10.2d)), 
which turned intervocalic [p] into [f], could have changed non-alternating 
words like safa ‘lip’ (from earlier sapa), and it could have spared verbs 
like soper/yispor, which became alternating (sofer/yispor) as a result of the 
sound change. The fact that this does not happen shows that phonetics 
and phonology mostly mind their own business and operate without 
consideration of morphological structure. 

But this does not yet explain why automatic alternations change 
into morphophonological alternations. For instance, in early Old High 
German, the Umlaut must have been an automatic alternation, so that, for 
instance, Jüdin ‘Jewish woman’ (derived from Jude ‘Jew’) could not have 
been pronounced otherwise because a back vowel had to be assimilated 
to a front vowel in the next syllable. But subsequently the phonological 
restriction that made u-i sequences unpronounceable was lost, and already 
in Middle High German words like Luchsin ‘female lynx’ were no problem. 
But why was the alternation retained? Why did Jüdin not revert to its earlier 
pronunciation Judin? The reason is apparently that speakers had already 
begun treating the front vowel ü as a property of the word. If Old High 
German speakers actively derived the word Jüdin each time it was produced 
by applying the Umlauting rule to an underlying representation [ju:din] 
(producing the surface representation [jü:din]), then we might expect that 
loss of the rule would cause the pronunciation of Jüdin to revert to Judin. But 
this did not happen, and such things do not happen in general. However, 
perhaps speakers stored the word in the lexicon as the surface form which 
they heard, or maybe they stored the set of stem allomorphs ([jud-] and 
[jüd-]), along with a generalization about their distribution. If so, there is 
no reason that we should expect the effects of automatic alternations to 
disappear along with the phonological rule. 

The loss of the phonological restriction thus revealed that speakers had 
reanalyzed the alternation. Speakers reinterpreted the automatic alternation 
as signalling (or co-signalling) a particular morphological pattern. An 
alternation that originally was purely phonologically conditioned came to 
be morphologically conditioned. This reinterpretation was hidden as long 
as the phonological rule was in effect, and became observable only upon 
further change in the language. 

Another example of an automatic alternation being reanalyzed as a 
morphophonological alternation comes from Zulu Labial Palatalization in 
passive verb forms. In Zulu, the passive voice is marked by a suffix -w(a), 
as illustrated in (10.9a). In (10.9b), we see the effects of Labial Palatalization. 
(Note that orthographic j = [dʒ], sh = [ʃ], ny = [ɲ]; otherwise the spelling 
reflects the pronunciation directly.) 


(10.9)  a. bon-a     ‘see’         bon-w-a      ‘be seen’
           shay-a    ‘beat’        shay-w-a     ‘be beaten’

        b. gubh-a    ‘hollow’      guj-w-a      ‘be hollowed’
           khiph-a   ‘take out’    khish-w-a    ‘be taken out’
           lum-a     ‘bite’        luny-w-a     ‘be bitten’
           bamb-a    ‘catch’       banj-w-a     ‘be caught’

        (Ziervogel et al. 1981: 106-7, 160, 163) 


The original alternation can be summarized as in (10.10). The rule applies 
wherever these conditions are met. 

(10.10) Labial Palatalization 

        bh              j
        ph              sh
        m      + w  →   ny
        mb              nj
        mp              ntsh


Interestingly, Zulu speakers evidently reinterpreted the alternation as 
co-signalling the passive meaning and introduced it into words that could 
never have developed palatals by phonological processes. Such words are 
shown in (10.11). 


(10.11) khumul-a     ‘loosen’         khunyul-w-a     ‘be loosened’
        khumbul-a    ‘remember’       khunjul-w-a     ‘be remembered’
        bophel-a     ‘harness’        boshel-w-a      ‘be harnessed’
        gijimis-a    ‘make run’       gijinyis-w-a    ‘be made to run’
        bophis-a     ‘make fasten’    boshis-w-a      ‘be caused to fasten’

        (Ziervogel et al. 1981: 106-7, 160, 163) 


In all these verbs, the root-final labial consonant is followed by some segments 
that would have protected it from undergoing Labial Palatalization as a 
sound change. The fact that it was extended to these cases shows that the 
alternation was reanalyzed as being morphologically conditioned. 
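
The reanalyzed pattern can be sketched in a few lines. The sketch makes one simplifying assumption that goes beyond the text: the passive palatalizes the rightmost labial of the set in (10.10), wherever it stands in the stem. This generalization is read off the forms in (10.9) and (10.11) only and is not offered as a full account of Zulu. The names PALATAL, LABIAL and passive are hypothetical, and stems are given without the final vowel -a.

```python
import re

# Labial -> palatal correspondences from (10.10), in Zulu orthography.
PALATAL = {"bh": "j", "ph": "sh", "m": "ny", "mb": "nj", "mp": "ntsh"}

# Longer spellings come first so that "mb" and "mp" are not read as plain "m".
LABIAL = re.compile("mb|mp|bh|ph|m")


def passive(stem: str) -> str:
    """Form the passive of a verb stem (given without final -a).

    Assumption for illustration: the rightmost labial of the set in (10.10)
    is palatalized, whether or not it is adjacent to the suffix -w.
    """
    matches = list(LABIAL.finditer(stem))
    if matches:
        last = matches[-1]
        stem = stem[:last.start()] + PALATAL[last.group()] + stem[last.end():]
    return stem + "-w-a"


if __name__ == "__main__":
    for stem in ["bon", "shay", "gubh", "khiph", "lum", "bamb",
                 "khumul", "khumbul", "bophel", "gijimis", "bophis"]:
        print(stem + "-a", "->", passive(stem))
```

Note that the function refers to the passive morphology directly (it is called only to build passives), which is exactly what it means for the alternation to be morphologically conditioned.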


10.4 Morphophonology as phonology 


With this foundation, we are now ready to address the following questions 
related to formal description of language architecture: Do automatic and 
morphophonological alternations reflect morphological structure or 
phonological structure? How should we describe the interface between 
morphology and phonology? 

One position, connected with the morpheme-based model, holds that 
both automatic and morphophonological alternations are generated 
by phonological structure.² In this view, alternations are the result of 
phonological rules induced by affixation and in formal terms are not 
therefore considered markers of morphological function or meaning. To see 
why this approach is appealing, it is useful to look in some detail at one 
particular proposal of this type, known as level ordering. 

Level ordering is rooted in the observation that in some languages, it is 
useful to distinguish between two types of affixes, depending in large part 
on their behaviour with respect to morphophonological alternations. We 
call these integrated and neutral affixes here. Their typical properties are 
summarized in Table 10.3. We begin by briefly examining integrated and 
neutral affixes in two languages, Lezgian and English. 

² Linguists who adhere to this position sometimes use phonological alternation as a general 
term for both automatic and morphophonological alternations, implying that both belong 
in the same component of grammar. Since this is a controversial issue, we avoid the term 
in this book in favour of the more neutral (and shorter!) term alternation. 


Integrated affixes                     Neutral affixes

are in the domain of stress            are not in the domain of stress
assignment                             assignment
trigger and undergo                    do not trigger or undergo
morphophonological alternations        morphophonological alternations
words with integrated affixes          words with neutral affixes may
show the phonotactics of               show phonotactic peculiarities
monomorphemic words
tend to occur closer to the root       tend to occur further from the root


Table 10.3 Integrated and neutral affixes 


In Lezgian, most inflectional suffixes are neutral, but some are integrated 
(all prefixes are integrated, but there are so few of them that they can be 
neglected here). To see the difference between the two types of suffixes, 
we need to consider the rule of stress assignment and two relevant 
morphophonological alternations (see (10.12)). 


(10.12) a. Lezgian Stress Rule: Stress is on the second syllable in the stress 
           domain, if there are at least two syllables in it. 

        b. Aspirate Ejectivization: A word-final voiceless aspirate 
           consonant (spelled Ch, where C stands for any consonant letter) 
           alternates with an ejective (spelled C’) if the plural suffix follows: 
           meth    met’-ér    ‘knee(s)’
           neth    net’-ér    ‘louse/lice’
           wakh    wak’-ár    ‘pig(s)’
           hagh    hag’-ár    ‘truth(s)’

        c. Vowel Harmony: The stressed syllable and the pre-stress 
           syllable agree in backness and, in the case of high front vowels, 
           labialization, i.e. the only allowed sequences of unlike vowels 
           are a-u, u-a, i-e, e-i, ü-e, e-ü. (Disallowed are a-e, e-u, i-ü, etc.; 
           note that Lezgian has the five vowel phonemes a, e, i, u, ü.) The 
           suffix vowels a/e and i/u/ü alternate: 
           q’al    q’al-ár    ‘stick(s)’     ǧal     ǧal-úni     ‘thread’
           q’ul    q’ul-ár    ‘board(s)’     č’ul    č’ul-úni    ‘belt’
           q’il    q’il-ér    ‘head(s)’      ric’    ric’-íni    ‘bowstring’
           q’ül    q’ül-ér    ‘dance(s)’     q’ül    q’ül-ǘni    ‘dance’

        (Haspelmath 1993: 56-8) 


The suffixes -er/-ar (plural) and -uni/-ini/-üni (oblique stem) that are illustrated 
in (10.12) are examples of integrated suffixes. As the examples show, they are 
in the stress domain (in accordance with the Stress Rule, they receive stress 
when they attach to a monosyllabic base) and they trigger and undergo 
morphophonological alternations ((10.12b) and (10.12c), respectively). 
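
How the three statements in (10.12) interact for an integrated suffix can be sketched as follows. The sketch is restricted to monosyllabic bases, so that the suffix vowel is always the second syllable of the stress domain and receives stress. It assumes, for illustration only, that a base-final h always spells aspiration and that an ASCII apostrophe stands for the ejective mark; the function names are hypothetical.

```python
BACK_VOWELS = set("au")
VOWELS = set("aeiuü")


def last_vowel(base: str) -> str:
    """Return the vowel of the syllable that precedes the suffix."""
    for ch in reversed(base):
        if ch in VOWELS:
            return ch
    raise ValueError("base contains no vowel")


def ejectivize(base: str) -> str:
    """Aspirate Ejectivization: final Ch is replaced by C' (assumed spelling)."""
    return base[:-1] + "'" if base.endswith("h") else base


def integrated_plural(base: str) -> str:
    """Integrated plural -er/-ar: harmonizes, is stressed, triggers (10.12b)."""
    suffix = "ár" if last_vowel(base) in BACK_VOWELS else "ér"
    return f"{ejectivize(base)}-{suffix}"


def integrated_oblique(base: str) -> str:
    """Integrated oblique-stem suffix -uni/-ini/-üni: harmonizes, is stressed."""
    vowel = last_vowel(base)
    if vowel in BACK_VOWELS:
        suffix = "úni"
    elif vowel == "ü":
        suffix = "ǘni"
    else:
        suffix = "íni"
    return f"{base}-{suffix}"


if __name__ == "__main__":
    for base in ["q'al", "q'ul", "q'il", "q'ül", "meth", "wakh"]:
        print(base, "->", integrated_plural(base))
    for base in ["č'ul", "ric'", "q'ül"]:
        print(base, "->", integrated_oblique(base))
```

A neutral suffix would need none of this machinery: it would simply be concatenated, with the stress left on the base.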

Besides these, Lezgian also has neutral plural suffixes and neutral 
oblique-stem suffixes, as illustrated in (10.13). 


(10.13) a. Lezgian oblique-stem suffix -di (neutral) 
           fil    fíl-di    ‘elephant’
           tip    típ-di    ‘type’
           nur    núr-di    ‘beam’
           din    dín-di    ‘religion’

        b. Lezgian plural suffix -ar (neutral)³
           tip    típ-ar    ‘type(s)’
           kür    kǘr-ar    ‘shed(s)’
           kar    kár-ar    ‘enclosure(s)’
           li     lí-jar    ‘hide(s)’

        (Haspelmath 1993: 68-9) 

These are not in the stress domain (the stress is on the first syllable in these 
words, contrary to the Stress Rule), and they neither undergo any alternations 
(in particular, they are not subject to vowel harmony) nor trigger 
them. Integrated suffixes always follow the root immediately, whereas 
neutral suffixes may also come after a derivational suffix. For instance, the 
noun čečen-wi ‘Chechen person’ (derived from Čečen ‘Chechnya’) has the 
plural čečen-wi-jar and takes the oblique-stem suffix -di (čečen-wi-di). 

Lezgian words with neutral suffixes are immediately recognizable 
as morphologically complex: consonant sequences like pd (in tipdi) are 
impossible morpheme-internally. By contrast, if we disregard meaning, all 
words in (10.12b-c) could be monomorphemic in principle. 

Let us now look at English, where the distinction between integrated and 
neutral affixes has occupied many morphologists and phonologists. Some 
examples of both types of affixes are given in (10.14). 


(10.14) integrated affixes: -ity, in-, -ical, -ion, -ian, -al, -y1, -ous, -ive 
neutral affixes: -ness, un-, -ly, re-, -ize, -able, -ful, -y2, -ism 


³ The neutral plural suffix -ar has the same shape as one of the alternants of the integrated 
plural suffix -er/-ar, but it is a distinct suffix. 

Integrated suffixes often lead to a stress shift, whereas neutral suffixes 
never do: 


(10.15) BASE          WITH INTEGRATED        BASE         WITH NEUTRAL
                      SUFFIX                              SUFFIX
        réal          reálity                nátural      náturalness
        cómedy        comédian               accómpany    accómpaniable
        phótograph    photógraphy (-y1)      ríckets      ríckety (-y2)
        advántage     advantágeous           bóunty       bóuntiful


Integrated suffixes may also trigger Trisyllabic Shortening (cf. (10.2a)), 
whereas neutral suffixes never do. The integrated prefix in- shows Nasal 
Assimilation of the n to the first consonant of the base (elegant/inelegant, but 
literate/illiterate, regular/irregular), whereas the n of the neutral prefix un- is 
always preserved (unlimited, unrealistic, etc.). The attachment of neutral affixes 
may lead to the violation of morpheme-internal phonotactic constraints — e.g. 
cleanness and unnecessary show two consecutive instances of [n], something 
that never occurs within a morpheme. Likewise, the suffix -ful brings about 
consonant sequences such as [pf] (e.g. hopeful) and [kf] (e.g. thankful) that 
do not occur morpheme-internally. The integrated affixes, by contrast, only 
create combinations that are independently possible morpheme-internally. 
And finally, English shows a strong tendency for integrated suffixes to occur 
close to the root, whereas neutral suffixes occur further away from the root. 
Integrated affixes do not, as a rule, attach to words derived by a neutral affix 
(*[hope-ful]-ity, *in-[friend-ly], *[kind-ness]-ical), whereas the opposite order is 
unproblematic ([natur-al]-ness, un-[product-ive], [Rastafari-an]-ism). 

The contrast between integrated and neutral affixes in English gave rise to 
the idea that the innate architecture of the grammar provides the possibility 
of several levels of affixes that are linked to particular (morpho)phonological 
rules. English derivation would have two levels, commonly called level I and 
level II (see Table 10.4). The basic idea is that the levels are ordered relative 
to each other, and within each level rules introducing affixes are ordered 
relative to phonological rules: sets of derivational affixes are paired with sets 
of phonological rules that apply after the affix has been introduced. 


Level                     Affixes                          (Morpho)phonological rules

level I                   -ity, in-, -ical, -ion, -ian,    Trisyllabic Shortening
(= integrated affixes)    -al, -y1, -ous, -ive             Stress Assignment
                                                           Nasal Assimilation

level II                  -ness, un-, -ly, re-, -ize,      Flapping
(= neutral affixes)       -able, -ful, -y2, -ism


Table 10.4 The two levels of English morpho(phono)logy 

This architecture requires level I affixes to be attached before level II 
affixes. In this way, level ordering explains the restriction that prohibits 
integrated affixes from attaching to words with neutral affixes. Sample 
derivations⁴ are given in (10.16). 


(10.16)                       nation      acid          debate          create
Root                          ˈnejʃn      ˈæsɪd         dəˈbejt         kriˈejt
Level I Morphology            ˈnejʃn-l    ˈæsɪd-ɪti                     kriˈejt-ɪv
Level I Phonology
  Trisyllabic Shortening      ˈnæʃn-l
  Stress Assignment                       æˈsɪdɪti
Level II Morphology                                     dəˈbejt-əbl     kriˈejtɪv-nɛs
Level II Phonology
  Flapping                                æˈsɪɾɪɾi      dəˈbejɾəbl      kriˈejɾɪvnɛs
Surface Representation        [ˈnæʃnl]    [æˈsɪɾɪɾi]    [dəˈbejɾəbl]    [kriˈejɾɪvnɛs]
                              national    acidity       debatable       creativeness


In addition, it explains why neutral affixes are not affected by 
morphophonological rules. Note that the word-forms national and chastity 
exhibit Trisyllabic Shortening. But in debatable and creativeness the rule does 
not apply, even though both have surface forms that meet the conditions for 
the rule. In a level ordering account, this is because Trisyllabic Shortening 
applies at level I, whereas for debatable and creativeness, the conditions for 
the morphophonological rule are met only at level II. 
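
The ordering restriction that this architecture predicts can be stated as a simple check. The sketch below assumes the level assignments of Table 10.4 (leaving out -y1 and -y2 for simplicity); the names LEVEL and obeys_level_ordering are hypothetical. It models only the prediction that no level I affix attaches outside a level II affix, not the phonological rules themselves.

```python
# Level assignments taken from Table 10.4 (-y1 and -y2 omitted for simplicity).
LEVEL = {}
for affix in ["-ity", "in-", "-ical", "-ion", "-ian", "-al", "-ous", "-ive"]:
    LEVEL[affix] = 1
for affix in ["-ness", "un-", "-ly", "re-", "-ize", "-able", "-ful", "-ism"]:
    LEVEL[affix] = 2


def obeys_level_ordering(affixes: list[str]) -> bool:
    """Check a sequence of affixes listed in order of attachment.

    The innermost affix comes first. The hypothesis predicts that level
    numbers never decrease as affixes are added.
    """
    levels = [LEVEL[affix] for affix in affixes]
    return all(inner <= outer for inner, outer in zip(levels, levels[1:]))


if __name__ == "__main__":
    print(obeys_level_ordering(["-al", "-ness"]))    # [natur-al]-ness
    print(obeys_level_ordering(["-ive", "un-"]))     # un-[product-ive]
    print(obeys_level_ordering(["-ful", "-ity"]))    # *[hope-ful]-ity
    print(obeys_level_ordering(["-ly", "in-"]))      # *in-[friend-ly]
    print(obeys_level_ordering(["-ness", "-ical"]))  # *[kind-ness]-ical
```

Since the check is categorical, any attested word that it rejects counts directly against the hypothesis.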

The level ordering approach thus has appealing powers of 
explanation. Moreover, it has been argued that level ordering has an 
added advantage in being a restrictive hypothesis about morphological 
architecture: like other models that posit only concatenative rules, it 
disallows many types of morphological patterns that are not observed in 
the world's languages.⁵ Not surprisingly, level ordering has been influential 
among generative morphologists and phonologists. 

At the same time, when the details are pinned down, application 
of level ordering to English encounters some serious problems. Some 
counterexamples to the ordering restriction are obvious and widely 
recognized: the level I suffix -ity can attach to the level II suffix -able as in 
readability, and -ation (a variant of -ion) can attach to -ize (e.g. realization). 
There are also problems with the pairing of affixes and rules. For example, 
the rule of Velar Softening (which changes underlying [k] into [s], and [g] 


⁴ Note that there are multiple senses of the term derive. In Chapter 3 and elsewhere, we have
used this term mostly to describe a static relationship between complex word-forms and
their bases. Here, however, the term is intended in the sense of ‘derivational phonology’:
constructing surface representations from underlying representations by applying a series
of rules.

⁵ At least, it is restrictive in this way if zero affixes are excluded. See Chapter 3 for a discussion
of restrictiveness as a goal in morphological analysis, and problems with zero affixes. 
into [dʒ] before certain suffixes – e.g. electric/electricity) is a clear example of
a morphophonological rule that should go with level I affixes. And, indeed, 
many level I affixes do trigger this rule (e.g. analogous/analogy, music/ 
musician, opaque/opacity), but there are also two level II suffixes that trigger 
it, -ize and -ism (e.g. public/publicize, fanatic/fanaticism). Also with respect to 
stress, integrated and neutral affixes may behave alike: words prefixed with 
in- and un- both share the same stress pattern, with secondary stress on the 
prefix (unnatural, unafraid, imprecise). This stress pattern contrasts with that of 
monomorphemic words like innocent, impudent, infidel. Thus, in this respect 
in- behaves as we would expect from a level II prefix (Raffelsiefen 1999b). 
Now, a few counterexamples do not in general invalidate a generalization, 
but if the generalization is supposed to be a direct consequence of the 
architecture of the grammar, counterexamples do become a big problem, 
because there is no way in which they could arise if the system of Table 10.4 
is assumed. 
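
The ordering restriction itself can be stated procedurally. The following Python sketch is only an illustration, not part of any published analysis: it records the level of the few suffixes whose level is mentioned in this section and checks whether a suffix sequence, listed from innermost to outermost, places a level I suffix outside a level II suffix. The names SUFFIX_LEVEL and violates_level_ordering are invented for this example.

```python
# Level I = integrated, level II = neutral, as described in this section.
# Only suffixes whose level is mentioned in the text are listed.
SUFFIX_LEVEL = {
    "-ity": 1,
    "-ation": 1,
    "-able": 2,
    "-ize": 2,
    "-ism": 2,
    "-ness": 2,
}


def violates_level_ordering(suffixes):
    """Return True if a level I suffix occurs outside a level II suffix.

    `suffixes` lists the suffixes of one word from innermost to outermost.
    """
    seen_level_two = False
    for suffix in suffixes:
        if SUFFIX_LEVEL[suffix] == 2:
            seen_level_two = True
        elif seen_level_two:
            # A level I suffix follows a level II suffix.
            return True
    return False


print(violates_level_ordering(["-able", "-ity"]))   # read-abil-ity
print(violates_level_ordering(["-ize", "-ation"]))  # real-iz-ation
print(violates_level_ordering(["-able"]))           # debat-able
```

Under these assumptions readability and realization are flagged as violations, which is why they count as counterexamples to a strict level ordering architecture.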

Even more damaging to the level ordering hypothesis is the fact that 
there appears to be an alternative explanation for the observed ordering 
restrictions. Most integrated affixes in English are quite unproductive 
anyway, so it seems unnecessary to invoke a level ordering architecture 
in order to explain why they do not attach to words derived with neutral 
affixes. The integrated affixes were borrowed along with complex words 
from French or Latin, and most of them never became truly productive in 
English. Even the most common suffix, -ity, cannot in general be used with 
new bases (cf. *chivalrosity, *naturality, ??effectivity), only in the special case
of adjectives derived by -able (readability, bagelizability, etc.). True, within 
strict limits it is sometimes possible to form new words with the integrated 
affixes; for instance, use of telescopy, grammophonic and credentious can 
be found with a Google search (telescopy is by far the most frequent of 
these three). Still, limited productivity probably goes a long way towards 
explaining affix ordering restrictions. 

Factors that affect the saliency of the morpheme boundary may also 
play a role. Remember that a word with an integrated affix has the same 
phonotactic structure as a monomorphemic word. The internal morpheme 
boundary is thus less salient in some sense. According to one proposal, this 
(among other factors) tends to make it more efficient for an English speaker 
to store a stem and integrated affix together as a single unit in the lexicon. 
(See Chapter 4 and Section 6.4.1 for details about lexical storage and its 
interaction with morphological structure.) And lexical storage, in turn, has 
consequences for affix ordering: since a stem and integrated affix constitute 
a unit, integrated affixes must occur closer to the root than neutral affixes 
(Hay and Plag 2004). Furthermore, in this complexity-based ordering 
hypothesis, integration is a matter of degree, which helps explain ordering 
restrictions within the two categories of affixes. More integrated affixes 
must occur closer to the root than less integrated affixes, and so on. Level 
ordering incorrectly predicts that there should be no ordering restrictions
within a given level. 

In the following section it is shown that the problems not only concern 
the hypothesis of level ordering, but in fact extend to the more general claim 
that morphophonological alternations fall into the domain of phonology. 


10.5 Morphophonology as morphology 


The alternative hypothesis – that morphophonological alternations properly
belong to the domain of morphology — begins from the observation that 
these alternations behave in ways that are typical of morphological structure 
more generally. For instance, we have already seen that morphophonological 
alternations can vary in productivity. This is a typical property of affixes, 
but not of phonological rules. Alternations can also serve as the basis for 
back-formation. An example of this comes from Polish. 

A widespread (and productive) alternation in Polish is the First 
Palatalization, whose effects are shown in (10.17). (Note that this is somewhat 
different from the Second Palatalization, which we saw in Section 10.2, and 
which occurs in different environments.) 


(10.17) [k]       [tʃ]                  k        cz
        [g]   →   [ʒ]     (spelling:    g    →   ż)
        [x]       [ʃ]                   ch       sz


The First Palatalization occurs, for instance, with the verb-deriving suffix
-yć, with the adjective-deriving suffix -ny and with the diminutive suffixes
-ek and -ka:


(10.18) kaleka     ‘cripple’    kaleczyć     ‘mutilate’
        dynamika   ‘dynamics’   dynamiczny   ‘dynamic’
        pończocha  ‘stocking’   pończoszka   ‘little stocking’
        krąg       ‘circle’     krążek       ‘little circle’


Polish has a productive pattern of back-forming words from non-diminutive 
words ending in -ek or -ka. These derivatives get an augmentative 
interpretation, as in (10.19). 


(10.19) ogórek   ‘cucumber’   ogór    ‘big cucumber’
        szpilka  ‘pin’        szpila  ‘big pin’

Now when this rule of subtractive augmentative formation is applied to
words ending in -szka or -czka, the result is a new word ending in -cha or -ka: 

(10.20) broszka    ‘brooch’        brocha     ‘big brooch’
        flaszka    ‘bottle’        flacha     ‘big bottle’
        gruszka    ‘pear’          grucha     ‘big pear’
        Agnieszka  (name)          Agniecha   ‘big Agnieszka’
        beczka     ‘barrel’        beka       ‘big barrel’
        taczka     ‘wheelbarrow’   taka       ‘big wheelbarrow’


The words in the left-hand column in (10.20) all have [ʃ] (sz) and [tʃ] (cz)
originally. For example, broszka was borrowed from French broche [brɔʃ],
flaszka was borrowed from German Flasche and gruszka was derived from 
grusza ‘pear tree’. The ch/k in the back-formed augmentatives is clearly 
new, and it shows that morphophonological alternations can operate in 
the reverse direction under certain circumstances. In this respect, these 
alternations are just like morphological rules. 
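
The back-formation pattern in (10.19) and (10.20) can be sketched as a small string operation. The Python function below is a minimal illustration that works on spelling only and assumes just the two reversals shown in (10.20), sz to ch and cz to k; the names DEPALATALIZE and augmentative are invented for this example, and nothing here is meant as a full model of Polish.

```python
# Orthographic reversal of the First Palatalization, limited to the two
# cases illustrated in (10.20): sz -> ch and cz -> k.
DEPALATALIZE = {"sz": "ch", "cz": "k"}


def augmentative(word):
    """Back-form an augmentative from a word ending in -ka or -ek."""
    if word.endswith("ka"):
        stem, ending = word[:-2], "a"
    elif word.endswith("ek"):
        stem, ending = word[:-2], ""
    else:
        raise ValueError(f"{word!r} does not end in -ka or -ek")
    # Undo the palatalization on the stem-final consonant, if there is one.
    for palatal, plain in DEPALATALIZE.items():
        if stem.endswith(palatal):
            stem = stem[: -len(palatal)] + plain
            break
    return stem + ending


for base in ["szpilka", "ogórek", "broszka", "beczka"]:
    print(base, "->", augmentative(base))
```

The point of the sketch is that the consonant change applies in the direction opposite to the historical one: the input already contains sz or cz, and the output contains a ch or k that the base never had.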

Based on this kind of data, many linguists would say that only automatic 
alternations are truly phonological, whereas morphophonological alternations
are really morphological in nature. This means that morphophonological 
alternations can signal (or co-signal) morphological meaning. From the 
perspective of language change, an automatic alternation that becomes 
closely associated with a morphological pattern is susceptible to being 
reanalyzed as part of the morphology, and it can then be expected to behave 
like other ingredients of morphological patterns. 

For instance, it is possible for a morphophonological alternation to 
become the sole formal marker of a pattern — e.g. when the original marker 
disappears for phonological reasons. This has happened in Modern Irish, 
where the past tense of verbs is marked by Lenition of the initial consonant. 
Lenition involves fricativization and some other changes and originally it 
occurred only in intervocalic position (like Hebrew Spirantization (10.2d)). 


(10.21) Modern Irish Lenition
        {k, g, t, d, p, b, s, f} → {x, ɣ, h, ɣ, f, w, h, Ø}
        (spelling: c, g, t, d, p, b, s, f → ch, gh, th, dh, ph, bh, sh, fh)


(10.22) PRESENT TENSE   PAST TENSE
        molaim          mhol mé        ‘I praise(d)’
        brisim          bhris mé       ‘I break/broke’
        sábhálaim       shábháil mé    ‘I save(d)’
        díbrím          dhíbir mé      ‘I banish(ed)’


The past tense was originally formed with a prefix do-, but this was lost, and 
nowadays only the Lenition is a signal of the past tense (but there are also 
different person-number markers, -(a)im and mé for first person singular). 
The result in Modern Irish was thus a base modification pattern, and we 
saw similar cases earlier in the book: German plurals signalled solely by 
the Umlaut (e.g. Mutter, Mütter ‘mother(s)’; see (3.1)), and Albanian plurals
signalled solely by Palatalization (e.g. [fik], [fic] ‘fig(s)’; see (3.3)). The loss of
the original affixes makes sense here if the morphophonological alternations
co-signalled the relevant morphological meaning, i.e. if they were part of 
the morphological system of expression. These data seem to further support 
a morphological approach to morphophonological alternations. 
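
To make the Irish pattern concrete, the following Python sketch forms the past-tense expressions in (10.22) from a past stem by leniting the initial consonant in the spelling. It is a simplified illustration under stated assumptions: the set of lenitable letters is taken from the spelling line of (10.21), with m added because (10.22) shows molaim, mhol; the past stem is supplied by hand rather than derived from the present; and other conditions on Lenition in Irish are not modelled. The names LENITABLE, lenite and past_first_singular are invented here.

```python
# Letters that lenite in spelling by adding h, following (10.21).
# The letter m is added on the basis of mhol in (10.22).
LENITABLE = set("cgtdpbsf") | {"m"}


def lenite(stem):
    """Return the stem with its initial consonant lenited (spelling only)."""
    first = stem[0]
    if first.lower() in LENITABLE:
        return first + "h" + stem[1:]
    return stem


def past_first_singular(past_stem):
    """Lenition is the only past marker; mé marks first person singular."""
    return f"{lenite(past_stem)} mé"


for stem in ["mol", "bris", "sábháil", "díbir"]:
    print(past_first_singular(stem))
```

No affix is added anywhere in past_first_singular: the only change to the verb itself is the modification of its first consonant, which is what it means for the alternation to be the sole formal marker.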

Furthermore, as pointed out already in Chapter 3, the existence of base 
modification without (overt) affixation has led many linguists to conclude that
we cannot reasonably reduce the operation of morphology to concatenation 
— even if morphophonological alternations are factored out of the equation. 
An adequate description of base modification patterns still requires either 
non-concatenative operations or zero affixes. And if non-concatenative 
operations must be posited anyway, nothing is really sacrificed by positing 
that the same kinds of principles govern morphophonological alternations 
as well. And since zero affixes are virtually unrestricted theoretical devices, 
shuttling morphophonological alternations off to the phonology does not 
clearly lead to a simpler, more restrictive morphology (despite frequent 
assumptions to the contrary). Theoretical concerns thus also seem to come 
down on the side of morphology. 

Now, a word of caution is in order here, lest we think that the data unquestionably
favour the morphological approach. The need for non-concatenative
operations in the morphological system does not entail that all
morphophonological alternations are morphology. A description of morphology 
that includes only concatenative rules forces morphophonology to be treated 
as phonology, but a model that allows non-concatenative morphological 
operations is, in principle, compatible with both a morphological and a 
phonological approach to these alternations (although perhaps more naturally 
aligned with the former). The key question thus becomes: Is phonology 
ever sensitive to morphological structure? If not, we can safely conclude 
that morphophonological alternations necessarily belong to the domain of 
morphology. But if so, we must be more careful in our conclusions.

It turns out that there is at least one good candidate for morphology- 
sensitive phonology: phonotactic constraints at morpheme boundaries. 
Affixation sometimes creates combinations of sounds that are not allowed 
morpheme-internally (e.g. pd in Lezgian tip-di 'type', the consonant 
cluster [ksts] in English text-s). More violations of morpheme structure 
conditions were presented in Section 4.2, including a pattern in Standard
Northern Italian whereby an alternation between [z] and [s] ([z] between 
vowels and [s] elsewhere) is violated morpheme initially, e.g. a[s]immetrico 
‘asymmetric’. So it seems that phonotactic constraints are sometimes 
sensitive to morphological structure. And given that phonotactics are 
considered a core aspect of phonological structure, it may turn out that little 
is sacrificed by allowing morphologically-conditioned phonological rules, 
i.e. by treating (some) morphophonological alternations as affix-induced phonology.

The proper place of morphophonology has not been conclusively settled. 
As should be obvious from the discussion in this chapter, analyses of the 
morphology-phonology boundary depend crucially on a host of other
assumptions about how morphology and phonology function. As long 
as debates in those areas continue, morphophonological alternations will 
probably continue to be a contentious issue. 


Summary of Chapter 10 


Two types of alternations can be distinguished: automatic alternations 
and morphophonological alternations. They differ in a variety of 
ways: automatic alternations show clear signs of phonetic motivation, 
may be optional and may apply across word boundaries, whereas 
morphophonological alternations have lost their connection to 
phonetics, have morphological conditioning, are obligatory and apply 
within words. Morphophonological alternations vary in productivity. 
Diachronically, automatic alternations turn into morphophonological 
alternations. Some languages have two types of affixes (here called 
neutral and integrated), depending on the type of alternation that 
they trigger. 

It is widely agreed that automatic alternations are phonological 
in nature, but the status of morphophonological alternations 
is controversial. A strict morpheme-based model requires 
morphophonological rules to belong to the phonological component 
of the grammar. For instance, in a level ordering account, 
morphophonological and automatic alternations (and neutral and 
integrated affixes) correspond to different rule blocks that are ordered 
relative to each other. However, this approach encounters a number of 
serious theoretical and empirical problems. The alternative hypothesis 
– that morphophonological alternations belong to the morphological
component – is appealing because it can account for commonalities
that morphophonology has with morphology. 


Further reading 


Alternations and derivational phonology are discussed in every phonology 
textbook (e.g. Gussenhoven and Jacobs 2005: ch. 6). The most influential 
work in derivational phonology is Chomsky and Halle (1968). 

The view that only automatic alternations are phonological and 
morphophonological alternations really belong to the morphology is 
highlighted in Hooper (1976) and Bochner (1993), among many others. 
The opposing view is defended in Mohanan (1995) and Kiparsky (1996). A 
variety of approaches to morphophonology are discussed in the papers in
Singh (1996). 

For diachronic change from phonological to morphophonological rules, 
see Wurzel (1980) and Joseph and Janda (1988). 

For the theory of level ordering (also called ‘Lexical Phonology’), see 
Kiparsky (1982, 1985) and Kaisse and Shaw (1985). Optimality Theory is a 
constraint-based framework that has been developed to explore some of the 
same interactions between morphology and phonology that motivated level 
ordering. Major references include Prince and Smolensky (1993), McCarthy 
and Prince (1993a, b), and Kager (1999). (Classical OT rejects the principle 
of level ordering, but it is worth mentioning Stratal Optimality Theory 
(Kiparsky 2000), which has adopted some of its fundamental insights.) 

Kiparsky (1982), Fabb (1988), Aronoff and Fuhrhop (2002), Hay and Plag 
(2004) and Plag and Baayen (2009) investigate affix order restrictions (in 
English) and their relation to productivity and other factors. 

The most comprehensive book on morphophonology is Dressler (1985). 
Important insights on morphophonology are found in Bybee (1985, 2001). 


Comprehension exercises 


1. Is the voicing alternation of English fricatives in
leaf/leaves 
knife/knives 
house/houses, etc. 


an automatic or a morphophonological alternation? 


2. English has a morphophonological alternation of [ŋ] and [ŋg] – e.g.
young [jʌŋ], younger [jʌŋgr]. Is this a relic alternation, a common
alternation or a productive alternation? 


3. Decide whether the following alternations are automatic or 
morphophonological, on the basis of the (necessarily incomplete) 
information given here. 


a. In Hausa, the alveolars t, d, s, z palatalize to c [tʃ], j [dʒ], sh [ʃ], j
[dʒ] when they occur before an original front vowel (Newman 2000:
414-15):

kaazaa ‘hen’ kàajii ‘hens’
ciizaa ‘bite’ ciiji ‘bite’ (imperative)
Hausa ‘Hausa’ Bahaushée ‘Hausa person’
gwadaa ‘measure’ gwajii ‘experiment’ (deverbal noun)

Recent sound changes have created new cases of ee and i:


          original form > current form
ai > ee   taiba > téeba ‘cooked cassava flour’
          ƙoosai > ƙoosee ‘fried beancake’
u > i     tukaatukii > tikaatikii ‘calf, shin’


Some English loanwords: 
laasiisii ‘licence’ 

teebur ‘table’ 

gazet ‘gazette’ 


b. In Spanish, the voiced stops b, d, g alternate with the fricatives [β, ð,
ɣ] if a vowel or fricative precedes them:
el dedo [el deðo] ‘the finger’
los dedos [loz ðeðos] ‘the fingers’
Damiano viene [damjano βjene] ‘Damiano is coming’
viene Damiano [bjene damjano] ‘Damiano is coming’


c. In Modern Greek, the velar phonemes [k], [g], [x], [ɣ] alternate with
the palatal phonemes [c], [ɟ], [ç], [ʝ] whenever they precede a front
vowel ([e] or [i]), e.g.

1SG   steko     exo
2SG   stecis    eçis
3SG   steci     eçi
1PL   stekume   exume
2PL   stecete   eçete
3PL   stekun    exun
      ‘stand’   ‘have’

Some loanwords: [cinino] ‘quinine’, [ɟemi] ‘reins’ (from Turkish gem).


d. In Polish, the vowel [o] alternates with [u] (spelled ó) in certain
morphological forms when the morpheme-final consonant does not
start a new syllable, e.g.
głowa ‘head.NOM.SG’ głów ‘head.GEN.PL’
głodu ‘hunger.GEN.SG’ głód ‘hunger.NOM.SG’
woda ‘water’ wódka ‘vodka’

However, there are numerous exceptions to this rule, not just
loanwords:

spora ‘spore.NOM.SG’ spor ‘spore.GEN.PL’
kodu ‘code.GEN.SG’ kod ‘code.NOM.SG’
wódeczka ‘little vodka’ wódka ‘vodka’


4. We saw that Zulu Labial Palatalization is a morphophonological 
alternation (and not an automatic alternation), because it is tied to 
particular morphological contexts. What other criteria can be invoked 
to support that conclusion? 


Morphology 
and valence 


So far we have focused our attention primarily on form-related aspects
of morphology. But this chapter will be entirely devoted to one type of 
function of morphological patterns. We will examine various ways in which 
morphology can affect valence — i.e. the expression of arguments in verbs 
and deverbal formations. We will first look at valence-changing operations 
such as passives and causatives (Section 11.1), then move on to the way 
in which valence is affected by compounding (Section 11.2), and finally 
discuss what happens to verbal arguments in transpositional derivation 
(i.e. derivational patterns that change the base's word-class) (Section 11.3) 
and transpositional inflection (Section 11.4). 


11.1 Valence-changing operations 


11.1.1 Semantic valence and syntactic valence (argument 
structure and function structure) 


Most verbs are associated with one, two or three arguments as part of 
their lexical entries (verbs with zero or more than three arguments are 
very rare, and many languages lack them completely). When we know a 
verb's meaning, we also know the semantic roles (alternatively, thematic 
relations) of the participants of the verbal event. For example, a verb that 
means 'eat' will have an agent participant (the entity doing the eating) 
and a patient participant (the thing being eaten) in all languages, a verb 
meaning ‘please’ will have an experiencer and a stimulus participant, and 
a verb that means 'steal, rob' will have an agent, a theme (the thing that is 
taken away) and a source participant. But this knowledge is not sufficient 
if we want to use these verbs, because the syntactic functions (such as 
subject, object, oblique) by which these participants are expressed differ 
from language to language and from verb to verb. As a concrete example,
the semantic-role structures and the syntactic-function structures of five 
English verbs are given in (11.1). 


(11.1) a. eat:     SBJ    –  OBJ
                    |         |
                   agent     patient
                   (Robert ate a mango.)

       b. like:    SBJ          –  OBJ
                    |               |
                   experiencer     stimulus
                   (I like this song.)

       c. please:  SBJ       –  OBJ
                    |            |
                   stimulus     experiencer
                   (This song pleases me.)

       d. steal:   SBJ    –  OBJ    –  OBLfrom
                    |         |         |
                   agent     theme     source
                   (Baba stole my bike from me.)

       e. rob:     SBJ    –  OBJ     –  OBLof
                    |         |          |
                   agent     source     theme
                   (Baba robbed me of my bike.)


The verbs please and like, and the verbs steal and rob, are roughly 
synonymous, so there is no way to predict their different behaviour from 
their meanings. Hence speakers must store not only the meaning of every 
verb, but also the syntactic functions that are associated with the semantic 
roles. Thus, the lexical entries of the verbs please and rob would look as in 
(11.2). 


(11.2) a. /pliz/V                       b. /rob/V
          SBJ       –  OBJ                 SBJ    –  OBJ     –  OBLof
           |            |                   |         |          |
          stimulus     experiencer         agent     source     theme
          ‘please’                         ‘rob’


The information that these entries contain in addition to the pronunciation, 
the word-class and the meaning is called the valence of the verb. The 
valence has two parts: the syntactic-function structure (‘syntactic valence’, 
also called simply function structure), and the semantic-role structure
(‘semantic valence’, also called argument structure).²

The argument structure can in principle be derived from the meaning 
(or conceptual structure, or event structure) of a verb. For example, a 
formal decomposition of the meaning of steal or rob looks as in (11.3) (see 
Jackendoff 1990). 


(11.3) [CAUSE ([A], [GOPoss ([B], [FROM ([C])])])
        {BY-FORCE}]


(11.3) can be paraphrased as ‘A causes B to go from C’s possession by 
force’ — i.e. A robs C of B. The participant A must be an agent because 
it is the first role of the semantic element CAUSE; the participant B must 
be a theme because it is the first role of the semantic element GO; and C
must be a source because it is the participant of FROM. Thus, it would in 
principle be possible to formulate the linking rules as direct links between 
the conceptual structure and the syntactic-function structure. The lexical 
entry of the verb steal would then be as in (11.4), where there is no separate 
argument structure. 


(11.4) /stil/V
       SBJ        OBJ        OBLfrom
        |          |          |
       ‘CAUSE ([A], [GOPoss ([B], [FROM ([C])])])’
        {BY-FORCE}


Although it is actually quite likely that the format of (11.4) is closer to the 
truth than the format of (11.2), in the present context a practical problem is 
that there is much less agreement about the right form of the conceptual 
decomposition of verb meanings than about semantic roles. Thus, we will 
mostly continue to use the simplified format of (11.2), bearing in mind that 
this is just an abbreviation and that the complete picture requires a more 
elaborate specification of verb meaning along the lines of (11.3). 
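
For readers who find a data structure helpful, the simplified entries in (11.2) can be sketched as follows. This is only one possible encoding, given here for illustration: the class name LexicalEntry, its field names and the label "OBL_of" are assumptions made for the example. Each link pairs a syntactic function with a semantic role, so that the function structure and the argument structure can both be read off the same list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalEntry:
    form: str          # pronunciation
    word_class: str    # e.g. "V"
    meaning: str       # informal gloss standing in for the verb meaning
    links: tuple       # pairs of (syntactic function, semantic role)

    def function_structure(self):
        """Syntactic valence: the functions, in order."""
        return [function for function, _ in self.links]

    def argument_structure(self):
        """Semantic valence: the roles, in order."""
        return [role for _, role in self.links]


PLEASE = LexicalEntry(
    "/pliz/", "V", "please",
    (("SBJ", "stimulus"), ("OBJ", "experiencer")),
)
ROB = LexicalEntry(
    "/rob/", "V", "rob",
    (("SBJ", "agent"), ("OBJ", "source"), ("OBL_of", "theme")),
)

print(PLEASE.function_structure(), PLEASE.argument_structure())
print(ROB.function_structure(), ROB.argument_structure())
```

Storing the links as explicit pairs reflects the point made above: the pairing of roles with functions has to be listed verb by verb, because it cannot be predicted from the meaning alone.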

Now morphological operations may change the valence of a verb in two 
different ways. On the one hand, they may change the linking of semantic 
roles to syntactic functions. Such operations are called function-changing 
operations (or voice). On the other hand, they may change the conceptual 


¹ The most important syntactic functions are subject (SBJ), (direct) object (OBJ) and oblique
(OBL – i.e. adpositional phrases and phrases in oblique cases). Two further functions that
are needed less commonly are indirect object (IOBJ) and adverbial (ADV). A syntactic
function that is needed for noun phrase structure is possessor (POSS).

² The links between argument-structure positions and function-structure positions
(indicated here by lines) are governed by a set of rules that have been extensively 
discussed by syntacticians and that we cannot go into here. The crucial point is that the 
rules cannot be entirely independent of knowledge of particular verbs. At least for some 
verbs such as like and please the function structure must be part of the lexical entry because 
it is idiosyncratic. 




structure (or event structure) of the verb in such a way that the argument 
structure is affected. We will refer to such operations as event-changing 
operations. Examples of both subtypes of valence-changing operations will 
be seen in the following sections. 


11.1.2 Agent-backgrounding operations 


The best-known function-changing operation is the passive, where the 
agent is backgrounded in that it is no longer the subject. Instead, the patient 
usually becomes the subject. English has a passive construction, of course: 
compare the active sentence Mark wrote a letter with the corresponding 
passive one The letter was written (by Mark). The passive in English has a 
complicated form, however, (involving both an auxiliary and a participle), 
so we will look here at the Chichewa passive, which is more clearly 
morphological and therefore serves our purposes better (the Chichewa 
type is far more common in the world's languages anyway). Examples of 
an active and a passive sentence from Chichewa are given in (11.5). 


(11.5) a. Naphiri  a-na-lemba     kalata.
          Naphiri  3SG-PST-write  letter
          ‘Naphiri wrote a letter.’

       b. Kalata  i-na-lemb-edwa      (ndi Naphiri).
          letter  3SG-PST-write-PASS  by   Naphiri
          ‘The letter was written (by Naphiri).’
          (Dubinsky and Simango 1996: 751-2)


In Chichewa, the passive is marked by the suffix -idw/-edw, which is attached 
directly to the verb stem (the ending -a is a stem extension that need not 
concern us here). Its syntactic effect is that the patient is linked to the subject 
function and the agent is linked to the OBLndi function. As in English, the
oblique agent is optional, as is indicated by the parentheses. Thus, we can 
formulate the rule for passivization as in (11.6). 


(11.6) /Xa/V                         /Xidwa/V
       SBJ    –  OBJ           ↔     (OBLndi)  –  SBJ
        |         |                     |          |
       agent     patient               agent      patient
       ‘doₓ’                           ‘be doneₓ’


Here, all that changes is the phonological form of the verb and the function 
structure of the verb (as well as the linking to the semantic roles). The 
argument structure is unaffected — it still specifies both an agent participant 
and a patient participant. Even when the oblique agent is omitted, it is 
still present implicitly: the sentence kalata inalembedwa means that some 
unspecified agent wrote the letter (not just that some agentless letter writing 
took place), as is clear from a sentence like (11.7), where the adverb mwadala
‘deliberately’ presupposes such an agent. 


(11.7) Chitseko  chi-na-tsek-edwa    mwadala.
       door      3SG-PST-close-PASS  deliberately
       ‘The door was closed deliberately.’
       (Dubinsky and Simango 1996: 751)


The passive is thus a prototypical example of a function-changing operation, 
or voice. 
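
The passive rule in (11.6) can also be written out as a small function, which makes it easy to see what changes and what does not. The Python sketch below is an illustration under explicit assumptions: entries are plain dictionaries, the label "OBL_ndi" stands for the optional oblique agent, and the choice between -idw and -edw is made by a simple vowel-height rule (-edw after a stem whose last vowel is e or o), which is not stated in the text and is assumed here only so that the examples come out as lembedwa and tsekedwa. The names passive_suffix and passivize are invented.

```python
def passive_suffix(stem):
    """Choose between -idw and -edw.

    Assumption (not from the text): -edw follows a stem whose last vowel
    is e or o, and -idw occurs otherwise.
    """
    vowels = [ch for ch in stem if ch in "aeiou"]
    return "edw" if vowels and vowels[-1] in "eo" else "idw"


def passivize(entry):
    """Apply the passive schema in (11.6): relink roles, keep them all."""
    form = entry["form"]
    if not form.endswith("a"):
        raise ValueError("expected a verb form ending in -a")
    stem = form[:-1]  # strip the final stem extension -a
    relinked = []
    for function, role in entry["links"]:
        if role == "agent":
            relinked.append(("OBL_ndi", role))  # optional oblique agent
        elif role == "patient":
            relinked.append(("SBJ", role))
        else:
            relinked.append((function, role))
    return {"form": stem + passive_suffix(stem) + "a", "links": relinked}


LEMBA = {"form": "lemba", "links": [("SBJ", "agent"), ("OBJ", "patient")]}
print(passivize(LEMBA))
```

The list of roles is the same before and after the operation; only the form and the functions change. This is what makes the passive function-changing rather than event-changing.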

A clear example of an event-changing operation is the anticausative, 
where the agent-backgrounding is much more radical than in the passive. 
Here, the agent is completely removed from the argument structure. An 
example comes from Russian, where the anticausative is expressed by the 
suffix -sja/-s'. 


(11.8) a. Vera      zakryla  dver’.
          Vera.NOM  closed   door.ACC
          ‘Vera closed the door.’

       b. Dver’     zakryla-s’.
          door.NOM  closed-ANTIC
          ‘The door closed.’


(11.9) /X/V                                         /Xsja/V
       SBJ    –  OBJ                          ↔     SBJ
        |         |                                  |
       agent     patient                            patient
       ‘CAUSE ([A], [BECOME ([STATEₓ ([B])])])’     ‘BECOME ([STATEₓ ([B])])’

In (11.9) we see that not only is the agent removed from the argument
structure, but also the cause element is eliminated from the conceptual 
structure (hence the term ‘anticausative’). It is in this sense that the 
anticausative is event changing and not merely function changing. The 
function change (patient becoming subject) is an almost trivial consequence 
of the main function of the anticausative. That the agent is not present in 
the argument structure (and in the verb meaning) can also be seen from 
the fact that it cannot appear as an oblique argument (*Dver' zakrylas’ Veroj 
‘The door closed by Vera’), and no agent-oriented adverbials may occur 
in the sentence (*Dver’ zakrylas’ namerenno "The door closed deliberately’; 
this sentence is possible only in an unlikely world in which doors have 
intentions). 

An even more radical change in the event structure of the verb is effected 
by the resultative (or stative) operation, which removes not only the ‘cause’ 
part of the event structure together with the agent, but also the ‘become’ part. 
An example of a resultative (marked by the suffix -ik/-ek) from Chichewa, 
which contrasts with the passive in (11.7), is given in (11.10a). The active
and resultative event structures are given in (11.10b).


(11.10) a. Chitseko  chi-na-tsek-eka.
           door      3SG-PST-close-RESULT
           ‘The door was closed (= in a closed state).’

        b. ‘CAUSE ([A], [BECOME ([CLOSED ([B])])])’  ↔  ‘CLOSED ([B])’


As in the Russian anticausative, neither an oblique agent nor an agent- 
oriented adverb is permitted (*Chitseko chinatsekeka ndi Naphiri ‘The door 
was in a closed state by Naphiri’; *Chitseko chinatsekeka mwadala ‘The door 
was in a closed state deliberately’) (Dubinsky and Simango 1996: 751). Note
that an interesting feature of the anticausative and resultative operations is 
that they are semantically subtractive — i.e. the derived form removes part 
of the conceptual structure of the base. 
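
The subtractive character of these two operations can be illustrated with a toy encoding of conceptual structure. In the Python sketch below, which is an assumption-laden illustration rather than a proposal about how event structure is actually represented, a conceptual structure is a nested tuple whose first element is a predicate such as CAUSE or BECOME. The function and variable names are invented for the example.

```python
def anticausative(event):
    """Remove the CAUSE layer, and with it the agent, as in (11.9)."""
    predicate, _agent, caused_event = event
    if predicate != "CAUSE":
        raise ValueError("expected a structure headed by CAUSE")
    return caused_event


def resultative(event):
    """Remove both CAUSE and BECOME, as in (11.10b)."""
    predicate, result_state = anticausative(event)
    if predicate != "BECOME":
        raise ValueError("expected BECOME under CAUSE")
    return result_state


def participants(event):
    """Collect the participant variables left in a structure."""
    found = []
    for part in event[1:]:
        if isinstance(part, tuple):
            found.extend(participants(part))
        else:
            found.append(part)
    return found


CLOSE = ("CAUSE", "A", ("BECOME", ("CLOSED", "B")))

print(anticausative(CLOSE), participants(anticausative(CLOSE)))
print(resultative(CLOSE), participants(resultative(CLOSE)))
```

In both derived structures the participant A is no longer present at all. This matches the observation that neither an oblique agent nor an agent-oriented adverb is possible with these forms, in contrast to the passive.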

Finally, another example of a valence-changing operation is the reflexive, 
where the agent and the patient are coreferential and hence can be thought 
of as occupying a single syntactic function. Examples of an active and a 
reflexive verb in Eastern Armenian are given in (11.11a) and (11.11b), 
respectively, and the rule is given in (11.12).³


(11.11) a. Mayr-ə      lvan-um   e    Seda-yi-n.
           mother-ART  wash-PRS  AUX  Seda-DAT-ART
           ‘Mother is washing Seda.’

        b. Seda-n         lva-cv-um      e.
           Seda(NOM)-ART  wash-REFL-PRS  AUX
           ‘Seda is washing (herself).’
           (Kozinceva 1981: 83)


(11.12) /Xnum/V                    /Xcvum/V
        SBJ    –  OBJ        ↔     SBJ
         |         |                |
        agent     patient          agentᵢ  patientᵢ
        ‘A actsₓ on B’             ‘A actsₓ on self’


In the reflexive voice, the meaning of the verb remains the same, but it is 
specified that the agent and the patient are coreferential (indicated in the 
right-hand word-schema in (11.12)). Thus, although the reflexive is not 
really event-changing, its effect is not strictly limited to function changing 
either. It is thus a borderline case between the two subtypes of valence- 
changing operations. 


³ Coreferentiality is conventionally indicated by subscript letters. So, the subscript i in both
agentᵢ and patientᵢ in (11.12) indicates that the agent and patient refer to the same entity.
Different subscript letters are used when two participants are not coreferential.

11.1.3 Patient-backgrounding operations


Antipassive is the term for a morphological operation whose effect is to 
background the patient in much the same way as the agent is backgrounded 
in the passive. An example of an active and an antipassive construction 
from West Greenlandic is shown in (11.13a–b). Note that the oblique patient
is marked by the instrumental case in West Greenlandic. The relevant part
of the antipassive rule is given in (11.14).


(11.13) a. Qimmi-p     inu-it         tuqup-pai.
           dog-ERG.SG  person-ABS.PL  kill-3SG.SBJ/3PL.OBJ.IND
           ‘The dog killed the people.’

        b. Qimmiq    (inun-nik)      tuqut-si-vuq.
           dog(ABS)  person-INST.PL  kill-ANTIP-3SG.IND
           ‘The dog killed (people).’
           (Fortescue 1984: 86, 206)


(11.14) SBJ    –  OBJ          ↔     SBJ    –  (OBLinst)
         |         |                  |          |
        agent     patient            agent      patient


Now we might ask whether there is also a patient-backgrounding 
operation that completely removes the patient from the argument structure 
(parallel to how the agent is removed from the argument structure in the 
anticausative). And, indeed, some languages have a valence-changing affix 
whose effect is that the patient cannot be expressed at all. We may call this 
operation deobjective. An example comes from Tzutujil. 


(11.15) a. x-Ø-uu-ch'ey
           PST-3SG.OBJ-3SG.SBJ-hit
           ‘he hit him’
        b. x-Ø-ch'ey-oon-i
           PST-3SG.SBJ-hit-DEOBJ-PST
           ‘he was hitting’
           (Dayley 1985: 89, 116)


(11.15b) is an intransitive verb in all respects: it has the suffix -i in addition to 
the prefix x- in the past tense (cf. x-eel-i ‘he went out’, contrasting with x-uu-
ch’ey in (11.15a) where there is no -i), it has only a single person-number prefix
for the subject, and it does not allow a patient to be expressed. However, it is 
unlikely that (11.15b) has a different event structure from (11.15a), because 
it is difficult to conceive of a hitting event without a patient participant — for 
hitting to occur, there must be something that is being hit. In anticausatives, 
agents can be eliminated from the event structure because the ‘cause’ 
element is eliminated: we can think of opening, breaking and similar events 
as occurring either through an external agent or spontaneously, but we 
cannot easily think of such events as occurring without a patient. Thus, 
the most likely valence-changing effect of the deobjective is that shown in 
(11.16). The crossed linking line above 'patient' means that this semantic 
role cannot be linked to any syntactic function. 


(11.16)  SBJ  -  OBJ             SBJ
          |       |       ↔       |        ×
         agent  patient          agent  patient


Thus, patient-backgrounding operations seem to be exclusively function 
changing. 
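
The linking diagrams in (11.14) and (11.16) can be read as mappings from semantic roles to syntactic functions. The following Python sketch is only an illustration under that assumption: the dictionary representation and the function names antipassive, deobjective and roles are hypothetical conveniences, not part of the notation used here.

```python
# A linking is modelled as a dict from semantic role to syntactic function.
# None means the role is still in the event structure but cannot be linked.
ACTIVE = {"agent": "SBJ", "patient": "OBJ"}


def antipassive(linking, oblique="OBL_inst"):
    """Background the patient: it becomes an optional oblique, as in (11.14)."""
    derived = dict(linking)
    derived["patient"] = f"({oblique})"  # parentheses mark optionality
    return derived


def deobjective(linking):
    """Keep the patient role but leave it unlinked, as in (11.16)."""
    derived = dict(linking)
    derived["patient"] = None
    return derived


def roles(linking):
    """The set of semantic roles, which an event-changing rule would alter."""
    return set(linking)


print(antipassive(ACTIVE))
print(deobjective(ACTIVE))
# Both operations are function changing: the set of roles is untouched.
print(roles(antipassive(ACTIVE)) == roles(ACTIVE) == roles(deobjective(ACTIVE)))
```

In this toy model, a function-changing operation only rewrites the values of the mapping, whereas an event-changing operation such as the anticausative would delete a key.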


11.1.4 Agent-adding operations: causatives 


When a new participant is added to a verb, the event structure must be 
enriched as well, so the causative is clearly an event-changing operation. 
Two examples of causative constructions from Japanese are given in 
(11.17b)-(11.18b), and (11.17c)-(11.18c) show the valence-changing rules. 


(11.17) a. Taroo ga  ik-u.
           Taro  NOM go-PRS
           ‘Taro goes.’
        b. Hanako ga  Taroo o   ik-ase-ta.
           Hanako NOM Taro  ACC go-CAUS-PST
           ‘Hanako made Taro go.’
           (Shibatani 1990: 308-10)

        c.  SBJ               SBJ   -   OBJ
             |        ↔        |         |
           agent             causer    agent


(11.18) a. Taroo ga  hon  o   yom-u.
           Taro  NOM book ACC read-PRS
           ‘Taro reads a book.’

        b. Hanako ga  Taroo ni  hon  o   yom-ase-ta.
           Hanako NOM Taro  DAT book ACC read-CAUS-PST
           ‘Hanako made Taro read a book.’
           (Shibatani 1990: 310)

        c.  SBJ  -  OBJ             SBJ  -  IOBJ  -  OBJ
             |       |       ↔       |        |       |
           agent  patient          causer   agent  patient


The semantic change in the event structure is obvious: it consists in 
adding the element ‘cause’ and with it a causer role (e.g. for ‘go’: [GO ([A])] ↔ 
[CAUSE ([B], GO ([A]))]). The linking of semantic roles to syntactic functions 
in causatives is complicated because languages cannot simply create a new 
syntactic function for the new role. Instead, causative verbs are made to 
fit into the existing function structures. The agent of an intransitive verb 
becomes an object as in (11.17b-c), but the agent of a transitive verb often 
becomes an indirect object (as in 11.18b-c), especially in languages that do 
not allow two equal objects. 
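
The way the displaced agent is fitted into the existing function structure in (11.17c) and (11.18c) can be sketched procedurally. This is a minimal illustration under stated assumptions: the dictionary representation and the function name causative are hypothetical, and the choice of IOBJ for a transitive base reflects only the common pattern described above, not a universal rule.

```python
def causative(linking):
    """Add a causer as subject and demote the original agent.

    The agent of an intransitive base becomes OBJ, as in (11.17c); the agent
    of a transitive base becomes IOBJ, as in (11.18c).
    """
    transitive = "OBJ" in linking.values()
    derived = {"causer": "SBJ"}
    for role, function in linking.items():
        if role == "agent":
            derived[role] = "IOBJ" if transitive else "OBJ"
        else:
            derived[role] = function
    return derived


GO = {"agent": "SBJ"}
READ = {"agent": "SBJ", "patient": "OBJ"}

print(causative(GO))    # {'causer': 'SBJ', 'agent': 'OBJ'}
print(causative(READ))  # {'causer': 'SBJ', 'agent': 'IOBJ', 'patient': 'OBJ'}
```

Unlike a function-changing rule, this operation adds a key (the causer role) to the mapping, which corresponds to the enrichment of the event structure.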

Causatives are probably the most common type of morphological 
valence-changing operation in the world's languages, but since they happen 
to be rare in Europe, linguists have often paid more attention to the agent- 
backgrounding constructions that are so common and varied in Europe. 


11.1.5 Object-creating operations: applicatives 


The applicative operation creates a completely new object in the function 
structure of the verb or shifts a non-object to the object function. An example 
of the latter kind comes from German, where the productive verbal prefix 
be- can have the effect of turning an indirect object into a direct object. The 
original direct object can be omitted or expressed as an oblique phrase. 
Example (11.19b) is the applicative construction corresponding to (11.19a). 


(11.19) a. IKEA liefert  dem Nachbar-n     die Möbel.
           IKEA delivers the neighbour-DAT the furniture.ACC
           ‘IKEA delivers furniture to the neighbour.’
        b. IKEA be-liefert    den Nachbar-n     (mit  Möbeln).
           IKEA APPL-delivers the neighbour-ACC  with furniture
           ‘IKEA delivers furniture to the neighbour.’

        c.  SBJ  -  OBJ  -  IOBJ             SBJ  -  (OBL_mit)  -  OBJ
             |       |        |       ↔       |          |          |
           agent  patient  recipient        agent     patient   recipient


This construction is called a recipient applicative because it is the recipient 
that becomes a direct object, and therefore receives accusative case. Cross- 
linguistically, almost all roles apart from the agent can become direct objects 
when an applicative marker is added to the verb. An example of a locative 
applicative from Ainu is shown in (11.20b). 


(11.20) a. A-kor    kotan   ta sirepa-an.
           1SG-POSS village to arrive-1SG.INTR
           ‘I arrived at my village.’
        b. A-kor    kotan   a-e-sirepa.
           1SG-POSS village 1SG.TR-APPL-arrive
           ‘I arrived at my village.’
           (Shibatani 1990: 65)

        c.  SBJ  -  ADV              SBJ  -  OBJ
             |       |        ↔       |       |
           agent  location          agent  location


Ainu has no case marking, but the subject-agreement marker a-, which is 
restricted to transitive verbs, clearly shows that the applicative prefix e- 
creates a direct-object function in the derived verb's function structure. 

However, an applicative may also add an object argument that was not 
in the function structure of the verb before. For example, Chamorro has a 
benefactive applicative, illustrated in (11.21b). 


(11.21) a. Ha     hatsa i   acho'.
           he.ERG lift  ABS stone
           ‘He lifted the stone.’

        b. Ha     hatsa-yi  si  Pedro ni  acho'.
           he.ERG lift-APPL ABS Pedro OBL stone
           ‘He lifted the stone for Pedro.’
           (Topping 1973: 253)


Thus, here the applicative adds a new participant (a beneficiary) to the 
argument structure: 


(11.21) c.  SBJ  -  OBJ             SBJ  -  OBL  -  OBJ
             |       |       ↔       |       |       |
           agent  patient          agent  patient  beneficiary


In the non-applicative construction, the patient is associated with the 
syntactic function of the object (indicated by the absolutive case in (11.21a)), 
but in the applicative construction the new argument of the verb, the 
beneficiary, is linked to the object function, and so it receives absolutive 
case. Correspondingly, the patient is linked to the oblique. 

This means that applicatives can be either function changing (as in (11.19) 
and (11.20)) or event changing (as in (11.21)). One might propose that these 
two subtypes of applicatives should be given different names, but it is in 
fact not so easy to keep them apart. One might argue, for instance, that the 
‘location argument’ of the Ainu verb sirepa is not in fact an argument but 
an adjunct; this would make it event changing as well. Moreover, some 
languages use the same affix for both benefactive and recipient applicatives, 
suggesting that this is indeed the same kind of operation. Thus, a sharp 
distinction between event-changing and function-changing operations can 
be problematic. 


11.1.6 General properties of valence-changing operations 


As we have seen, valence-changing operations primarily affect agents/ 
subjects and patients/direct objects. Other participants can be promoted 
to object (or occasionally to subject) status, but there are no operations 
that change an oblique to an indirect object, for example. Explaining such 
possible restrictions on valence changing is a matter for syntactic and 
semantic theories of verbal event structure and argument linking, and thus 
largely beyond the scope of this book. 

However, here it still needs to be pointed out that the semantic/syntactic 
contrast between event-changing and function-changing operations shows 
a clear correlation with derivational and inflectional status of the valence- 
changing affixes. Passives and antipassives are primarily inflectional, 
whereas anticausatives, resultatives and causatives are primarily 
derivational. Reflexives and applicatives tend to show mixed behaviour, 
again correlating with their intermediate status with respect to the event- 
changing/function-changing contrast. 

Since inflectional operations may apply after derivational ones, but the 
reverse is not typically true (see Chapter 5), an important consequence of 
this contrast is the prediction that it should be possible to apply a function- 
changing operation to an event-changing operation, but not vice versa. This 
seems generally to be true. For example, in Chichewa the passive suffix 
-idw (a function-changing operation; see (11.5) above) can be attached to a 
benefactive applicative verb in -ir. 


(11.22) a. Chibwe a-na-phik-ir-idwa      nyemba.
           Chibwe 3SG-PST-cook-APPL-PASS beans
           ‘Chibwe was cooked beans for.’
           (Dubinsky and Simango 1996: 752)

        b. active:               benefactive applicative:             passive:

           SBJ  -  OBJ           SBJ  -    OBJ    -   OBL             (OBL_ndi)    SBJ       OBL
            |       |      ↔      |         |          |       ↔         |          |         |
          agent  patient        agent  beneficiary  patient            agent  beneficiary  patient


The reverse ordering is not possible in Chichewa, although it would 
make sense semantically (cf. 11.23a). However, the applicative suffix can 
follow the resultative suffix, as in (11.23b), because the applicative and the 
resultative are both event-changing operations. 


(11.23) a. *Chitseko chi-na-tsekul-idw-ira  Chibwe.
            door     3SG-PST-open-PASS-APPL Chibwe
            ‘The door was opened for Chibwe.’

        b. Chitseko chi-na-tseku-k-ira       Chibwe.
           door     3SG-PST-open-RESULT-APPL Chibwe
           ‘The door was opened (= in an opened state) for Chibwe.’
           (Dubinsky and Simango 1996: 757)
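
The ordering prediction illustrated by (11.22) and (11.23) can be stated as a simple check over the sequence of valence-changing suffixes. The sketch below is only an illustration: the function name order_is_predicted and the label sets are hypothetical, and the classification of each operation follows the discussion above rather than any independent analysis. Since the generalization is said to hold only generally, the check models a tendency, not an exceptionless law.

```python
# Classification as in the text: passive and antipassive are function
# changing; resultative, causative and anticausative are event changing.
# The applicative is treated as event changing here, as in the discussion
# of (11.23b).
EVENT_CHANGING = {"APPL", "RESULT", "CAUS", "ANTICAUS"}
FUNCTION_CHANGING = {"PASS", "ANTIP"}


def order_is_predicted(suffixes):
    """Return True if no event-changing suffix follows a function-changing one.

    `suffixes` lists the valence-changing markers from the root outwards.
    """
    seen_function_changing = False
    for suffix in suffixes:
        if suffix in FUNCTION_CHANGING:
            seen_function_changing = True
        elif suffix in EVENT_CHANGING and seen_function_changing:
            return False
    return True


print(order_is_predicted(["APPL", "PASS"]))    # order found in (11.22a)
print(order_is_predicted(["PASS", "APPL"]))    # order of the starred (11.23a)
print(order_is_predicted(["RESULT", "APPL"]))  # order found in (11.23b)
```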

Valence-changing operations are in many ways syntactic phenomena, and 
in languages where they are signalled by specific morphological patterns, 
they also clearly belong to morphology. However, most of the operations 
that we have seen in this section also occur with no specific formal coding. 
For instance, English has alternations such as (11.24)-(11.25). 


(11.24) a. I opened the door.
        b. The door opened.
(11.25) a. I baked a cake for her.
        b. I baked her a cake.


The alternation in (11.24) clearly resembles the anticausative and the 
causative operation, and (11.25) is very much like a benefactive applicative. 
The English alternations are not usually discussed under the heading 
of morphology, but there is really no deep reason why they should not. 
Morphological operations need not be associated with a particular change in 
the pronunciation, as we saw earlier (e.g. Section 3.1.4). When they are not, 
morphologists speak of conversion, and, while this term is mostly applied 
to word-class-changing operations, it could easily be transferred to valence- 
changing operations. Note also that such valence-changing operations may 
vary in productivity, from sporadic to extremely productive, much like 
other morphological processes. 


11.2 Valence in compounding 


When one of the members of a compound takes arguments, this may be 
affected by the compound structure and the result may be a kind of valence 
change. We will look at valence change in three different compound types: 
noun incorporation, V-V compound verbs and synthetic compounds. 


11.2.1 Noun incorporation 


Noun incorporation is the traditional term for N-V compounds with a 
verbal head; we have encountered examples already in Sections 7.1 and 9.1. 
Since verbs typically require arguments, it is natural for incorporated nouns 
to occupy an argument position of the verb. Consider (11.26) from Guaraní. 


(11.26) a. A-jogua-ta   petei mba'e.
           1ACT-buy-FUT one   thing
           ‘I will buy something.’

        b. A-mba'e-jogua-ta.
           1ACT-thing-buy-FUT
           ‘I'll go shopping.’
           Lit.: ‘I'll thing-buy.’
           (Velázquez-Castillo 1996: 107)

In incorporated structures like (11.26b), the dependent noun is clearly part 
of the compound verb, and not a separate word. For instance, as shown 
in (11.27), the incorporated noun cannot be modified by an adjective. (The 
different placement of the future marker in (11.26) and (11.27) is not directly 
relevant here.) This indicates that the dependent noun is not an independent 
word, but rather a member of the compound. (See Section 9.1 for discussion 
of compounding tests.) 


(11.27) *A-ha-ta     a-mba'e-hepy-jogua
         1ACT-go-FUT 1ACT-thing-expensive-buy
         ‘I'll go shopping for expensive items.’
         Lit.: ‘I will go expensive-thing-buying.’
         (Velázquez-Castillo 1996: 108)


Now, the interesting thing here is that in (11.26b) the noun is also the patient 
of the verbal action. This raises the question: In noun incorporation, is 
specification of valence necessarily part of the compounding rule? 

One possibility would be to simply say that the semantic relation 
between the head verb and the dependent noun is vague, as in English 
N-N compounds (e.g. lipstick). The patient interpretation (‘buy things’) 
would then be a natural implicature, but not strictly speaking part of the 
compound verb's meaning. If this is so, we would expect incorporated 
nouns to be able to fulfil other semantic roles besides the patient, and this is 
indeed possible in quite a few languages with noun incorporation. Example 
(11.28) is from Huauhtla Nahuatl. 


(11.28) Ya' ki-kočillo-tete'ki panci.
        he  3SG.OBJ-knife-cut  bread
        ‘He cut the bread with the knife.’
        (Merlan 1976: 185)


Here, the incorporated noun -kočillo- ‘knife’ is the instrument of the action. 
Thus, it may be that the noun incorporation rule in these languages does 
not affect the syntax of the verb at all, and that the valence change is only 
apparent. 

Still, languages seem to differ in this respect. In many languages there 
is clear morphosyntactic evidence for a valence-changing effect of noun 
incorporation. This is the case, for example, in Ainu, which has different 
subject-agreement affixes in transitive and intransitive sentences, as we saw 
earlier in (11.20) (e.g. -an for first person singular intransitive, a- for first 
person singular transitive). 


(11.29) a. Inaw a-ke.
           inaw 1SG.TR-make
           ‘I make an inaw (a wooden prayer symbol).’
        b. Inaw-ke-an.
           inaw-make-1SG.INTR
           ‘I make an inaw.’
           (Shibatani 1990: 11, 28)


In contrast to the transitive simple verb ke ‘make’, the compound verb 
inaw-ke ‘make an inaw’ is intransitive, as is clearly seen in the choice of the 
subject affix. Thus, it is not sufficient to say that the patient interpretation 
in (11.29b) arises as a pragmatic implicature — here, it must be part of the 
compounding rule, which can be formulated as in (11.30). 


(11.30)  /X/N     &     /Y/V                     ↔     /XY/V
                        SBJ   -   OBJ                  SBJ
                         |         |                    |
                        agent_i  patient_j             agent_i
         ‘x’            ‘A_i acts on B_j’              ‘A_i acts on x’


As this rule shows, the patient variable of the semantic structure of the 
simple verb is filled by the meaning of the incorporated noun, so that the 
semantic structure of the compound verb contains only a single variable 
and hence only a single argument. As in the case of the reflexive voice 
(Section 11.1.2), we have here a borderline case between event changing 
and function changing. 


11.2.2 V-V compound verbs 


A compound type that is rarely found in European languages, but that is 
very interesting from the point of view of valence, is V-V compounding. 
Two well-known languages in which such compounds are common are 
Chinese and Japanese. 

The simplest and least problematic case involves two verbs with the same 
argument structure - e.g. Japanese ukare-sawagu [make.merry-be.noisy] 
‘go on a spree’, Mandarin Chinese tang-huai [iron-break] ‘ruin by ironing’. 
Example (11.31) shows how the Chinese verb is used. 


(11.31) Meimei tang-huai  le  nei  jian xin yi.
        sister iron-break PRF that CLF  new clothes
        ‘Sister ruined those new clothes by ironing them.’
        (C. H. Chang 1998: 82)


The rule for Chinese tang-huai could be formulated as in (11.32).⁴ 


(11.32)  /X/V                       /Y/V                        /XY/V
         agent_i  patient_j    &    agent_i  patient_j    ↔     agent_i  patient_j
         ‘A_i acts_x on B_j’        ‘A_i acts_y on B_j’         ‘A_i acts_x and acts_y on B_j’


4 Here we use a simplified representation because we are only interested in the way 
that elements of argument structure combine in compounding. Naturally, in a full 
representation, function structure and links between it and argument structure would 
also be specified. 


However, both Chinese and Japanese allow verbs with different argument 
structures to be compounded as well. In Japanese, where compounds are 
usually right-headed, it is mostly the second verb that determines the 
argument structure of the compound. An example is given in (11.33), and 
the correspondence is shown in (11.34). 


(11.33) Sono booru wa  sora takaku (*Jon  niyotte) uchi-agat-ta.
        the  ball  TOP sky  high     John by       hit-go.up-PST
        ‘The ball was hit high up in the air (by John).’
        (Matsumoto 1996: 204)


(11.34)  /uchi/V                    /agaru/V                     /uchiagaru/V
         agent_i  patient_j    &    theme_j  direction_k    ↔    theme_j  direction_k
         ‘A_i hits B_j’             ‘A_j goes up to B_k’         ‘A_j is hit upwards to B_k’


In uchi-agaru [hit-go.up] ‘be hit high up in the air’, the first verb is transitive 
and the second is intransitive (with an additional direction argument), and 
the theme of the head verb is identified with the patient of the dependent 
verb. The agent of the dependent verb completely disappears from the 
argument structure, as is shown by the fact that it cannot be expressed as a 
kind of passive agent. The head verb agaru contributes its arguments to the 
compound. 

The association of intransitive theme and transitive patient is very 
natural (both of these semantic roles are affected by the processes in which 
they are involved), but an intransitive theme may also be identified with an 
intransitive agent: 


(11.35) a. Japanese
           hataraki-tsukareru [work-get.tired] ‘get tired from working’
           tatakai-yabureru   [battle-lose]    ‘lose as a result of fighting’
           (Matsumoto 1996: 204)

        b. Chinese
           zou-lei    [walk-get.tired] ‘get tired from walking’
           xiao-jiang [laugh-stiff]    ‘be stiff from laughing’
           (C. H. Chang 1998: 83)


(11.36)  /hataraki/V      &    /tsukareru/V        ↔    /hataraki-tsukareru/V
         agent_i               theme_i                  theme_i
         ‘A_i works’           ‘A_i gets tired’         ‘A_i gets tired from A_i working’


Perhaps the most interesting type of V-V compound is the argument- 
mixing type, in which the compound verb’s argument structure has 
arguments from both constituent verbs. An example is Japanese mochi-kaeru 
[have-return] ‘bring back’ — see (11.37) and the correspondence in (11.38). 


(11.37) Jon  wa  kamera o   ie    ni mochi-kaet-ta.
        John TOP camera ACC house to have-return-PST
        ‘John brought the camera back home.’
        (Matsumoto 1996: 208)


(11.38)  /mochi/V                      /kaeru/V                     /mochikaeru/V
         possessor_i  theme_j     &    agent_i  direction_k    ↔    agent_i  theme_j  direction_k
         ‘A_i has B_j’                 ‘A_i returns to C_k’         ‘A_i brings B_j back to C_k’


In this compound verb, all arguments of the constituent verbs end up as 
arguments of the compound verb. In this way, mochikaeru contrasts with 
uchiagaru. Clearly, Japanese V-V compounding consists of different sub- 
rules in which the argument linking is crucially different. 
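
The contrast between the two sub-rules can be made concrete by treating an argument structure as a list of role-index pairs. The following sketch is purely illustrative: the function names head_determined and argument_mixing are hypothetical, and the convention that the head verb's role label wins for a shared index is an assumption chosen so that the outputs match (11.34) and (11.38).

```python
def head_determined(dependent, head):
    """The compound takes the head verb's arguments only, as in (11.34)."""
    return list(head)


def argument_mixing(dependent, head):
    """Pool the arguments of both verbs, as in (11.38).

    Arguments sharing a referential index are identified; the head verb's
    role label is kept for a shared index.
    """
    by_index = {}
    for role, index in list(dependent) + list(head):
        by_index[index] = role  # later (head) entries overwrite earlier ones
    pairs = [(role, index) for index, role in by_index.items()]
    return sorted(pairs, key=lambda pair: pair[1])


UCHI = [("agent", "i"), ("patient", "j")]
AGARU = [("theme", "j"), ("direction", "k")]
MOCHI = [("possessor", "i"), ("theme", "j")]
KAERU = [("agent", "i"), ("direction", "k")]

print(head_determined(UCHI, AGARU))
print(argument_mixing(MOCHI, KAERU))
```

In the first case the agent of uchi is simply dropped; in the second, the theme of mochi survives alongside the agent and direction of kaeru.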

The final case to be mentioned here is the ambiguous type represented by 
Chinese qi-lei [ride-tired]. This can mean two different things: 


(11.39) Zhangsan qi-lei     le  ma.
        Zhangsan ride-tired PRF horse
        a. ‘Zhangsan was tired from riding horses.’
        b. ‘The horse was tired from Zhangsan’s riding/Zhangsan rode
           the horse tired.’
        (C. H. Chang 1998: 82)


Thus, here the theme argument of lei ‘tired’ can be identified either with the 
agent or with the patient of qi ‘ride’. 


11.2.3 Synthetic nominal compounds 


A compound whose head is a deverbal noun and whose dependent 
member fills an argument position in the head's valence is often called a 
synthetic compound. Examples from English include N-N compounds of 
the following type: 


(11.40) a. truck driver      b. whale hunting
           pipe smoker          rat poisoning
           air-cleaner          duck-shooting
           fire-fighter         meter-feeding


The deverbal head noun inherits the verb’s valence requirements. Thus, the 
noun driver can be analyzed as taking a patient argument, like its base verb 
drive, and this argument is filled by truck, the dependent member of the 
compound. Likewise, the noun hunting can be analyzed as taking a patient 
argument, which is filled by whale. (Argument inheritance is discussed in 
Section 11.3 below.) 

There are at least three different ways in which the semantic relationship 
between the head and the dependent member in a synthetic compound 
could be described. The first approach derives argument interpretation 
from a constituent structure that is different from that of ordinary N-N 
compounds. Consider the compounds in (11.41). 


(11.41) a. race driver b. dog hunting 
chain smoker food poisoning 
vacuum cleaner gang shooting 
freedom fighter breast feeding 


(some examples from Oshita 1995: 183, 189) 


While formally similar to the examples in (11.40), these are not synthetic 
compounds because the dependent member does not fill an argument of 
the head. A chain smoker does not smoke chains (cf. pipe-smoker), a freedom 
fighter does not fight freedom (cf. fire-fighter), and food poisoning does not 
involve poisoning food (cf. rat poisoning). In this approach, the structure of 
pipe-smoker would be as in (11.42a), contrasting with that of chain smoker in 
(11.42b). 


(11.42) a.       N               b.      N
               /   \                   /   \
              V     Nsuff             N     N
             / \      |               |    / \
            N   V     |               |   V   Nsuff
            |   |     |               |   |     |
          pipe smoke -er           chain smoke -er


An argument in favour of having different structures for synthetic and 
non-synthetic compounds is the fact that, in more complex compounds, 
a dependent noun that is interpreted as an argument must be closest to 
the deverbal head: chain pipe smoker, but *pipe chain smoker. The constituent 
structure analysis provides a natural account of the ungrammaticality of 
the second example. 

However, the approach also makes some incorrect predictions. If synthetic 
compounds have the constituent structure [[N V]_V -er]_N, it is difficult to 
explain the systematic lack of N-V compounds in English: pipe-smoker, 
but *to pipe-smoke; hat-seller, but *to hat-sell; churchgoer, but *to churchgo; cab- 
driving, but *to cab-drive. And the few instances of N-V compounding that do 
exist seem to have been created by back-formation from the corresponding 
synthetic compounds, not directly as N-V compounds. For instance, to 
skydive is a back-formation from skydiving and skydiver. 

The second approach to synthetic compounds involves a special rule 
of argument linking, analogous to the incorporation rule above in Section 
11.2.1 (ex (11.30)). Let us assume that the noun hunting has the argument 
structure [agent patient], just like the verb hunt, and the function structure 
[POSSESSOR_of - OBLIQUE_by] (e.g. hunting of whales by traditional fishermen). 
Then the compound whale hunting eliminates the patient/POSSESSOR_of 
argument and the resulting compound is ‘intransitive’, i.e. it takes only a 
single OBLIQUE_by argument (e.g. whale hunting by traditional fishermen). The 
complete rule is shown in (11.43). 


(11.43)  /X/N    &    /Y/N                              ↔    /XY/N
                      POSS_of  -  OBL_by                     OBL_by
                         |          |                          |
                      patient_j   agent_i                    agent_i
         ‘x’          ‘event of A_i acting_y on B_j’         ‘event of A_i acting_y on x’


This approach is less radical than the first approach in that it does not 
assume a completely different compounding structure for synthetic 
compounds. (11.43) is an instantiation of the general English compounding 
rule (3.30), being merely more specific in that it specifies what happens to 
the arguments and the syntactic functions. This seems necessary, at least for 
action nouns like hunting, because the possibilities of associating semantic 
roles and syntactic functions are severely restricted (for instance, we cannot 
have *fisherman hunting of whales). 

Finally, the simplest approach is to deny that any special rule is needed 
at all. In this view, compounds like truck-driver and whale hunting are 
described as ordinary N1-N2 compounds that do not mean more than ‘N2 
that has some relation to N1’. This is the approach that we took in Section 
7.1 for other types of compounds (consider the semantic relationship of the 
dependent to the head in lipstick vs. sea bird vs. swansong). In truck-driver, 
this meaning (‘driver who has some relation to a truck’) is then naturally 
interpreted as ‘driver who drives a truck’ by a pragmatic implicature. 
Similarly, whale hunting really means only ‘hunting that has some relation to 
whales’, but a natural pragmatic implicature gives rise to the interpretation 
‘hunting in which whales are hunted’. This analysis does not imply that 
individual compounds cannot acquire the argument interpretation. But it 
does mean that unlike in the second approach, there would be no general 
rule to account for argument interpretation in synthetic compounds. 

This approach has the advantage of being able to capture similarities 
to other types of compounds. We can think of synthetic compounds as a 
special type of what are called affix compounds here. Affix compounds are 
patterns that consist of more than one stem plus an affix. In addition to 
the examples in (11.41) above (chain smoker, food poisoning), English also has 
affix compounds of the following type: 


(11.44) green-eyed ‘having green eyes’ 
dark-haired ‘having dark hair’ 
red-roofed ‘having a red roof’ 


Given that in affix compounds, the affix often attaches to a base that 
is not itself a compound (*a green-eye, *to pipe-smoke, *to race-drive), the 
descriptions in (11.45) seem preferable to any kind of description that relies 
on hierarchical constituent structure for semantic interpretation. (Note that 
it is possible to say that someone has a green eye, but this is a phrase, not a 
compound - there is primary stress on both lexemes.) 


(11.45) a.  /X/A    &    /Y/N    ↔    /XYd/A
            ‘x’          ‘y’          ‘having (a) y(s) with the property x’
        b.  /X/N    &    /Y/V    ↔    /XYer/N
            ‘x’          ‘y’          ‘a person who does y, having to do with x’


This approach utilizes the general rule of English compounding, but unlike 
the second approach, it leaves interpretation of the relationship between 
the head and the dependent to pragmatic implicature. This allows for a 
unified description of the formal similarities between synthetic and other 
types of affix compounds. Of course, not directly specifying the semantic 
relationship between the head and the dependent in synthetic compounds 
also creates the potential for overprediction. This approach has difficulty 
accounting for semantic restrictions of the kind *fisherman hunting of whales. 

Thus, there are arguments for all three approaches, and it is possible 
that different approaches are appropriate for different compounds or for 
different languages (synthetic compounding has been most intensively 
discussed for English). Other things being equal, it would of course be 
desirable to have just a single type of rule, but it remains to be seen whether 
other things are in fact equal. Thus far, no consensus has formed about how 
to formally describe synthetic compounds. 


11.3 Transpositional derivation 


11.3.1 Transposition and argument inheritance 


A derivational process is called transpositional when it changes the 
word-class of the base lexeme. Some typical examples of transpositional 
derivation are shown in (11.46). 


(11.46) a. N → V   English     computer → computerize
        b. V → N   Russian     napolnit’ ‘fill’ → napolnenie ‘filling’
        c. A → V   Basque      luze ‘long’ → luza-tu ‘lengthen’
        d. V → A   Italian     mangiare ‘eat’ → mangiabile ‘edible’
        e. A → N   Japanese    atarashii ‘new’ → atarashisa ‘newness’
        f. N → A   Indonesian  tahun ‘year’ → tahunan ‘annual’


Valence may be affected by transposition when a verb or an adjective is 
transposed into another word-class (non-derived nouns normally cannot 
be said to have a valence potential, so transpositional derivations of nouns 
are hardly relevant here). When a verb such as examine is transposed into 
an action noun such as examination, its basic meaning (referring to an 
event with an agent and patient) is still intact, but the arguments cannot 
be expressed in the same way as with the base verb. We can say The vet 
examined the pet, but not *[The vet examination the pet]NP (took one hour). This 
is because deverbal nouns behave much like ordinary nouns in that they 
do not take subject and object arguments, but only possessor and oblique 
arguments. Thus, we have The examination of the pet by the vet (took one hour). 
The patient argument becomes an of-possessor, and the agent argument 
becomes a by-oblique. The resulting noun phrase is similar to noun phrases 
with non-derived noun heads such as the portrait of Charles V by Titian. The 
relation between the valences of examine and examination can be described 
with our usual notation as in (11.47). 


(11.47)  /examine/V              /examination/N
         SBJ  -  OBJ             (OBL_by)  -  POSS_of
          |       |       ↔         |            |
         agent  patient           agent       patient


In transpositional derivation, when a derived word has a valence that 
corresponds to the valence of the base, we say that the derivative inherits 
the base's valence. 

In the following subsections, we will take a closer look at various kinds 
of transpositional derivation. 


11.3.2 Action nouns (V → N) 


Perhaps the most interesting type of transpositional derivation is the action 
noun (or event noun), which refers to the event or action itself, not to 
a participant of the event. Action nouns show the greatest variety 
of argument structure phenomena both within a language and across 
languages. In English and other European languages, two different types of 
event noun can be distinguished, the simple event noun (e.g. (11.48a)) and 
the complex event noun (e.g. (11.48b)). 


(11.48) a. I have an examination tomorrow. 
b. The vet's careful examination of Fido's eyes took a long time. 
c. The examination is on your desk. 


The basic difference between them is that complex event nouns preserve 
more verbal properties than simple event nouns. Sometimes a third type of 
event noun is distinguished, called concrete noun, and illustrated in (11.48c). 
However, this is not really an event noun, because it does not refer to an 
event. But it is necessary to mention this type in the present context because 
this is a widespread phenomenon: in many languages, the derivational 
patterns used for action nouns can also have concrete meanings. However, 
the kinds of concrete meanings are unpredictable: the product of an action 
(building, painting, judgement, composition), a group of people (management, 
government) or a manner (conjugation). Concrete nouns seem to arise by ill- 
understood and unsystematic (though frequent) processes of metonymic 
meaning shift, not by a word-formation rule, so we need not discuss them 
further. 

Returning to simple and complex event nouns, we note that when the 
verbal arguments are expressed with a complex event noun, it must be 
definite (see (11.49b)) and cannot be pluralized (see (11.49c)). 


(11.49) a. the examination of Fido's eyes by the vet
        b. *an examination of Fido's eyes by the vet
        c. *three examinations of Fido's eyes by the vet


Simple event nouns are more like ordinary nouns in that they can be 
indefinite or definite (Tomorrow I have an/the examination), and they can be 
pluralized (Tomorrow I have three examinations). Moreover, complex event 
nouns can be modified by duration adverbs like frequent and constant, 
whereas simple event nouns cannot (cf. the frequent examination of Fido's 
eyes/*a frequent examination). But, in the present context, the most important 
difference between complex event nouns and simple event nouns is that only 
the former inherit the verb's argument structure. Thus, for complex event 
nouns, the function-changing transposition rule in (11.47) is appropriate, 
whereas, in simple event nouns, the argument structure is not preserved. 
As a result, simple event nouns may occur on their own, with no arguments 
expressed, as in (11.50). 


(11.50) a. The examinations took a long time.
        b. We are witnessing a new development.
        c. The destruction was awful to see.


By contrast, complex event nouns derived from transitive verbs require the 
overt expression of the patient, while the agent may be optionally present, 
as seen in (11.51).⁵ 


(11.51) a. The frequent examination *(of the evidence) (by the scientists) is 
           necessary. 
        b. The constant development *(of new inexpensive housing) (by the city) 
           was applauded. 
        c. The continuing destruction *(of rainforests) (by humans) will speed up 
           desertification. 


5  An asterisk before an expression in parentheses means that the expression cannot be left 
out. 


In some languages, complex event nouns have an argument structure 
that is even more verb-like in that the patient is coded as an accusative NP. 
An example comes from Modern Hebrew. 


(11.52) ha-hafcaca   ha-tedira     šel  ha-cava   et   ha-ir 
        the-bombing  the-frequent  of   the-army  ACC  the-city 
        ‘the army's frequent bombing of the city’ 
        (Siloni 1997: 170) 


In English, only oblique arguments coded by a PP and clausal arguments 
may be retained in an action noun construction (e.g. they rely on her → 
their reliance on her; they elected Maria as president → their election of Maria as 
president; I predict that it will rain → my prediction that it will rain). 


11.3.3 Agent nouns (V → N) and deverbal adjectives (V → A) 


An agent noun is one that refers to the agent of the action, rather than the 
action itself. In contrast to (complex) event nouns, agent nouns in English 
and in many other languages do not seem to inherit the verb's argument 
structure. Expressions such as *voter for Mitterrand, *thinker about deep 
problems or *claimer that Armageddon is near are systematically impossible. 
However, it is, of course, possible to have a possessive phrase that serves 
the same purpose as a verbal argument: explorer of Antarctica, founder of Lund 
University, Mitterrand’s voters, and so on. One could see this as evidence that 
to some extent the verbal argument structure may be inherited after all, 
but a simpler account is available: possessive phrases have a very general 
meaning, and often the precise interpretation is left to pragmatic inferences 
from the context, as in the case of compounds (see Section 11.2). Given 
the meaning of an agent carrying out some action, the interpretation of a 
possessive phrase as a patient of that action is readily available, so we do 
not need to say that it arises as a result of argument inheritance. This view 
is also confirmed by the fact that agent nouns, unlike complex event nouns, 
do not admit an agent-oriented adverbial such as a purpose clause: 


(11.53) a. *an explorer of America in order to discover El Dorado 
        b. the exploration of America in order to discover El Dorado 


Thus, the rule for deriving an agent noun of a transitive verb would be as in 
(11.54), where the derived noun lacks an argument structure. 


(11.54)  [ /X/V                   ]        [ /Xer/N              ] 
         [ SBJ   –   OBJ          ]        [                     ] 
         [  |         |           ]   ↔    [                     ] 
         [ agent_i   patient_j    ]        [                     ] 
         [ ‘A_i acts_x on B_j’    ]        [ ‘person who acts_x’ ] 


(Note that if this is the right analysis, we have to revise what we said about 
synthetic compounds above. The idea that agent nouns do not have an 
argument structure is incompatible with the second approach to describing 
synthetic compounds that was presented in Section 11.2.3 (see (11.43)). 
Perhaps we ought to say that the third approach to synthetic compounds 
outlined in that section, which rests on pragmatic inference, is appropriate for 
agent nouns and other deverbal formations that lack an argument structure, 
whereas the second approach is appropriate for complex event nouns.) 

English deverbal adjectives in -able seem to be similar to agent nouns in 
that they do not generally inherit oblique or clausal arguments from the 
base verb (*convincible of the eventual success, *emptiable of water, *persuadable 
that I’m right, but cf. deductible from income tax). 


11.3.4 Deadjectival transposition (A → N, A → V) 


Adjectives are much less often associated with their own argument and 
function structure, but many languages have at least a few 
argument-taking adjectives (such as English proud of, full of, similar to, obedient to, 
different from, responsible for, ready to do something). In English, most of these 
oblique arguments are preserved in deadjectival quality nouns (similarity 
to, obedience to, responsibility for, readiness to do something, ??difference from), 
though in some cases we have idiosyncratic changes (pride in, not *pride of). 

In deadjectival verbs, the oblique argument may also be preserved. The 
examples in (11.55)-(11.56) show that in Russian the adjectival argument 
structure is inherited. The adjective gordyj ‘proud’ takes an instrumental 
oblique argument, and so does the deadjectival verb gordit’sja ‘to pride 
oneself (on)’. The adjective gotovyj ‘ready’ takes an infinitival argument, 
and so does the deadjectival verb gotovit’sja ‘get ready’. 


(11.55) a. On  gord   svoimi       dostiženijami. 
           he  proud  self’s.INST  achievements.INST 
           ‘He is proud of his achievements.’ 

        b. On  gorditsja       svoimi       dostiženijami. 
           he  proud.3SG.REFL  self’s.INST  achievements.INST 
           ‘He prides himself on his achievements.’ 


(11.56) a. On  gotov  uexat’     iz    strany. 
           he  ready  leave.INF  from  country 
           ‘He is ready to leave the country.’ 

        b. On  gotovitsja          uexat’     iz    strany. 
           he  get.ready.3SG.REFL  leave.INF  from  country 
           ‘He is getting ready to leave the country.’ 


A counterexample would be English fill, which does not behave like full (cf. 
full of, fill with). 

A difficulty in determining whether the adjectival argument structure is 
inherited is the fact that the choice of the preposition or oblique case that 
marks the adjectival argument is rarely completely arbitrary. In many cases, 
it could be argued that the choice of the preposition or case is determined 
semantically and is independent of the base adjective. 


11.4 Transpositional inflection 


A particular challenge for morphologists and syntacticians is the description 
of transpositional (word-class-changing) inflection. In transpositional 
inflection, not just some, but all of the argument structure of the base 
is preserved, plus its other combinatory possibilities. An inflectional 
V → A transposition is called a participle in many languages (see (11.57) 
from German), and an inflectional V → N transposition is called a masdar 
in some languages (cf. example (11.58) from Lezgian). 


(11.57) der  im      Wald    laut  pfeif-end-e        Wanderer 
        the  in.the  forest  loud  whistle-PTCP-M.SG  hiker 
        ‘the hiker who is whistling loud in the forest’ 


(11.58) Wun      fad    qarag-un-i       cun     tazub     iji-zwa. 
        you.ABS  early  get.up-MASD-ERG  we.ABS  surprise  do-IMPF 
        ‘That you are getting up early surprises us.’ 
        (Haspelmath 1993: 153) 


A less well-known example of word-class-changing inflection is the 
Hungarian proprietive (‘having’, N → A): 


(11.59) rendkívül nagy  hatalm-ú uralkodó 
extremely great power-PROPR monarch 
‘monarch with extremely great power’ 
(Kenesei 1995-96: 164) 


The participle is similar to the deverbal adjective (Section 11.3.3), but note 
that it also inherits the possibility to combine with a locative modifier (im 
Wald ‘in the forest’) and a manner modifier (laut ‘loud’). The masdar is 
similar to the action noun, but it preserves the verbal valence completely: 
in (11.58), the agent argument is in the absolutive case, and in this respect 
it is very different from a noun’s modifier or argument. Moreover, (11.58) 
also shows that the masdar is like a verb, not like an action noun, in that 
it can combine with an adverb (cf. the behaviour of English action nouns: 
*My perusal carefully of the article versus my careful perusal of the article). The 
Hungarian proprietive is similar to denominal adjectives like powerful, 
but, unlike such adjectives in English, Hungarian proprietives can take 
prenominal modifiers that only nouns can take. 

This suggests that, if we want to describe the syntactic behaviour of 
participles, masdars and proprietives (and other inflectional transpositions 
not mentioned here), instead of invoking a mechanism of inheritance from 
the base lexeme, we should say that we do not have a new lexeme here at 
all but an inflected word-form of the same lexeme. Participles and masdars 
are verbs, and Hungarian proprietives are nouns. Combined with their 
dependents (i.e. their arguments and modifiers), they yield verb phrases 
and noun phrases: 
(11.60) German 

                         VP 
            ┌────────────┼───────┐ 
            PP           AdvP    V 
        ┌───┴───┐        │       │ 
        P       NP       │       │ 
        im      Walde    laut    pfeif-end 
        in.the  forest   loud    whistle-PTCP 

        ‘whistling loud in the forest’ 


(11.61) Lezgian 

                  VP 
        ┌─────────┼───────┐ 
        NP        AdvP    V 
        wun       fad     qaragun-i 
        you.ABS   early   get.up-MASD 

        ‘you rising early’ 


(11.62) Hungarian 

                     NP 
              ┌──────┴──────┐ 
              AP            N 
        ┌─────┴─────┐       │ 
        Adv         A       │ 
        rendkívül   nagy    hatalm-ú 
        extremely   great   power-PROPR 

        ‘having extremely great power’ 


If we want to account for the behaviour with respect to their dependents, 
this description of these constructions is unexceptionable, but now we face 
a paradox: we have just said that participles, masdars and proprietives do 
not change the word-class of their base, although at the beginning of this 
section we said that they were examples of word-class-changing inflection. 
And, of course, there are good reasons for saying that a participle is an 
adjective. For instance, in German it shows exactly the same agreement 
inflection as adjectives, and it precedes the noun in an NP. There are 
also good reasons for saying that the Lezgian masdar is a noun: it shows 
nominal case inflection and occurs in the same syntactic environment as 
non-derived nouns. The Hungarian proprietive, too, is adjective-like with 
respect to its position and its pluralization. 

A possible solution to this paradox is the following. Participles, masdars 
and proprietives show dual behaviour - they act like verbs, verbs and 
nouns with respect to their dependents, but like adjectives, nouns and 
adjectives with respect to the other elements in the sentence. We conclude 
from this dual behaviour that they have a dual nature: a lexeme 
word-class and a word-form word-class (Haspelmath 1996). As a lexeme, a 
participle is a verb, just like the other verb forms. But, as a word-form, 
a participle is an adjective. The internal syntax of a word is determined 
by its lexeme word-class, and the external syntax of a word is determined by 
its word-form word-class. 

Let us now see how we could describe the external syntax of the phrases 
in (11.60)-(11.62). One possibility would be to assume a structure as in 
(11.63) for the German phrase in (11.57). 


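(11.63) 
                                       NP 
        ┌──────────────────────────────┼─────────────────────┐ 
        D                              AP                    N 
        │                     ┌────────┴────────┐            │ 
        │                     VP                A            │ 
        │     ┌───────────────┼──────┐          │            │ 
        │     PP              Adv    V          │            │ 
        der   im Wald         laut   pfeif-     -end-e       Wanderer 
        the   in.the forest   loud   whistle-   -PTCP-M.SG   hiker 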


This representation has two disadvantages. First it makes the claim that the 
participle pfeifende belongs to two different syntactic constituents, although 
usually one assumes that a unitary word-form must also be a unitary 
syntactic constituent. Second, it works only for transpositional formations 
that are characterized by affixes. Participles such as Hebrew šorek ‘whistling’ 
behave just like German pfeifend, but they cannot be represented as in (11.63) 
because they have no participial affix: the participle is signalled by the 
vowel pattern o-e (cf. the past tense šarak of this verb). 

An alternative proposal that does not have these disadvantages is to 
indicate the dual word-class membership in the syntactic trees. A participle 
can be represented as a word-syntactic tree as in (11.64a), contrasting with a 
derivational transpositional form such as an agent noun, given in (11.64b). 


(11.64) a. pfeifend ‘whistling’          b. Pfeifer ‘whistler’ 

              ⟨⟨V⟩A⟩                            N 
             ┌───┴───┐                      ┌───┴───┐ 
             V       A                      V       N 
           pfeif    end                   Pfeif     er 


In (11.64a), the lexeme word-class is given in the inner angled brackets, 
and the word-form word-class is given in the outer angled brackets. Thus, in 
inflectional transposition, properties of the word-class of both constituents 
are preserved. By contrast, in derivational transposition, the derivative has 
primarily the head's word-class properties. 

If such dual-word-class representations are admitted in the syntax, we get 
(11.65), where the phrasal node dominating pfeifend also has dual category 
membership. The notation ‘⟨⟨V⟩A⟩P’ can be read as ‘VP with respect to 
internal syntax, AP with respect to external syntax’. 


(11.65) 
                              NP 
        ┌─────────────────────┼─────────────────────┐ 
        D                  ⟨⟨V⟩A⟩P                  N 
        │     ┌───────────────┼──────┐              │ 
        │     PP              Adv    ⟨⟨V⟩A⟩         │ 
        der   im Wald         laut   pfeifende      Wanderer 
        the   in.the forest   loud   whistle.PTCP   hiker 


The difference between transpositional inflection and transpositional 
derivation is interestingly similar to the difference between event-changing 
and function-changing operations that we saw in Section 11.1. 
Event-changing operations are generally derivational and involve a change in 
the argument structure of the base, like most transpositional derivation. 
Function-changing operations are generally inflectional and involve no 
change in the argument structure of the base, like transpositional inflection. 
The main difference is that function-changing operations of course change 
syntactic functions, whereas in prototypical transpositional inflection no 
functions are changed. 

Moreover, it should be recalled that the difference between 
event-changing and function-changing operations is not always clear-cut, and we 
often find intermediate cases. Transpositional operations are no different. 
Some inflectional forms do require some limited function changing - e.g. 
English masdar-like expressions of the type Maria’s criticizing Robert, the 
guest’s arriving late, where the verb’s subject is coded not as a subject but as a 
prenominal possessor. On the other hand, derivational formations in some 
languages allow the expression of adverbials. Examples (11.66a-b) are from 
Spanish (adverbials: hoy ‘today’ and todavía más ‘even more’), and example 
(11.67) is from Modern Greek (adverbial: prosektiká ‘carefully’). 


(11.66) a. la   inauguración  hoy    en  Barcelona  del     Congreso 
           the  inauguration  today  in  Barcelona  of.the  Congress 
           ‘the inauguration today in Barcelona of Congress’ 

        b. la   caída    de  los  precios  todavía  más 
           the  falling  of  the  prices   still    more 
           ‘the falling of the prices even more’ 
           (Rainer 1993: 214) 


(11.67) i    katastrofí   ton         engrafon          prosektiká 
        the  destruction  the.GEN.PL  documents.GEN.PL  carefully 
        ‘the destruction of the documents carefully’ 
        (Alexiadou 1999: 19) 


However, this blurring of the boundaries between word-class-changing 
inflection and derivation is not surprising if we remember what we said in 
Chapter 5 — that there is reason to view the boundary between inflection 
and derivation generally as a continuum, rather than a dichotomy. 


Summary of Chapter 11 


The most interesting inflectional values and derivational meanings are 
those that affect the valence of the base: valence-changing operations, 
some types of compounding and transpositional derivation (in 
transpositional inflection, the base’s valence remains unaffected). 
Valence-changing operations may be event changing (i.e. the event 
structure of the base and therefore its argument structure are 
modified) or function changing (i.e. only the function structure of the 
base is modified). The most important valence-changing operations 
are passive, reflexive, anticausative, resultative, antipassive, causative 
and applicative. In compounds, valence is potentially affected if at least 
one of the bases is a verb (as in incorporation and V-V compounding) 
or a deverbal derivative (as in synthetic nominal compounds). 
Transpositional derivatives such as action nouns and agent nouns 
inherit the base’s valence to a greater or lesser extent. In transpositional 
inflection, the base’s valence is completely preserved, but, in order to 
arrive at a coherent description, one needs to differentiate between a 
word’s lexeme word-class and word-form word-class. 


Further reading 


For syntactic theories that are deeply concerned with semantic valence 
(argument structure) and syntactic valence (function structure), see Dik 
(1997), Van Valin and LaPolla (1997) and Bresnan (2001). 

Passive morphology is discussed in Haspelmath (1990). For antipassives, 
see Cooreman (1994), for resultatives, Nedjalkov (1988), and for causatives, 
Dixon (2000). 

An overview of noun incorporation is given in Mithun (1984), and see 
Mithun and Corbett (1999) for noun incorporation and valence. 

Japanese compound verbs are discussed in Matsumoto (1996); for Chinese 
compounds, see Packard (2000). Synthetic compounds are discussed 
lucidly in Oshita (1995). For action nouns, see Koptjevskaja-Tamm (1993) 
and Grimshaw (1990). 

Transpositional inflection is discussed in Haspelmath (1996). 


Comprehension exercises 


1. Formulate the morphological rule for adjectives of the type supportive of 
(derived from the verb support) (cf. Sections 11.3.3-11.3.4), analogous to 
the rule in (11.47). 


2. English has one kind of verbal valence-changing prefix that can be 
regarded as an applicative marker, the prefix out-, as in 


run outrun 
play outplay 
shine outshine 


Formulate the rule for out-, stating how the function structure, the 
argument structure and the meaning are affected. 

3. The phrase ruler over a large empire is accepted by many speakers 
of English. Which generalization of this chapter is the phrase a 
counterexample to? 


Frequency effects in morphology 


The ways in which speakers use language have a profound influence on 
language structure, and frequency is one of the most important sources 
for system-external explanation of language structure. In fact, we have 
already seen examples in this book of frequency affecting the content of the 
lexicon (Section 4.3), productivity (Sections 6.4.1-6.4.2) and word-class shift 
(Chapter 8). In this chapter we explore how frequency matters for language 
structure, and why. Frequency influences word structure in many ways, 
but one of the most striking effects is found in inflection, where frequency 
asymmetries result in asymmetrical structural behaviour of various kinds. 
We look at some examples of this interaction. 


12.1 Asymmetries in inflectional values 


In inflectional systems, we often observe asymmetries in the behaviour of 
inflectional values that belong to the same inflectional feature, including 
number (singular versus plural), case (nominative versus accusative), voice 
(active versus passive) and polarity (affirmative versus negative). 


12.1.1 Frequent and rare values 


Frequency differences among some common inflectional values are 
summarized in Table 12.1, where ‘>’ means ‘is more frequent than’. It should 
be noted that not every word in a language will exhibit these frequency 
asymmetries. These generalizations should instead be taken as describing 
the overall pattern of a language. Also, not all languages have inflection 
for all these features, but the claim is that, when a language has inflection 
for one of these features and values, it will conform to the generalization 
expressed in the table. 
Feature     Values, ordered by frequency 
number      singular > plural > dual 
case        nominative > accusative > dative 
person      3rd > non-3rd (1st/2nd) 
degree      positive > comparative > superlative 
voice       active > passive 
mood        indicative > subjunctive 
polarity    affirmative > negative 
tense       present > future 

Table 12.1 Frequent and rare values 


The generalizations in Table 12.1 can be illustrated by examining the results 
produced by counting usage of number values in six languages: 


(12.1)                 Singular  Plural  Dual   Number of nouns 
       French          74.3%     25.7%          1,000 
       Latin           85.2%     14.8%          8,342 
       Russian         77.7%     22.3%          8,194 
       Sanskrit        70.3%     25.1%   4.6%   93,277 
       Slovene         72.5%     26.9%   0.6%   11,711 
       Upper Sorbian   64%       30%     6%     unknown 

       (Corbett 2000: 281; data partly from Greenberg 1966: 31-2) 


The differences between languages that we see here could be due to slight 
differences in the meanings of the number values, or they might simply 
be due to the genre or style of the text chosen. Ideally, the token frequency 
of inflectional values should be counted in a text that is representative 
of the everyday spoken language in the community, and finding such 
representative texts is not straightforward. But, fortunately, the asymmetries 
in the usage of number values are so robust that the same result is generally 
obtained, no matter which texts we look at. So we can safely say that the 
singular has a higher frequency than the plural. 
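
The ranking can be checked mechanically against the figures in (12.1). The following is only a minimal sketch: the dictionary and function names are invented for illustration, the Slovene singular share is read as 72.5%, and a missing dual is represented as None.

```python
# Shares (in %) of singular, plural and dual noun tokens, copied from (12.1).
# None marks a language for which no dual figure is given.
counts = {
    "French": (74.3, 25.7, None),
    "Latin": (85.2, 14.8, None),
    "Russian": (77.7, 22.3, None),
    "Sanskrit": (70.3, 25.1, 4.6),
    "Slovene": (72.5, 26.9, 0.6),  # singular share read as 72.5% (assumption)
    "Upper Sorbian": (64.0, 30.0, 6.0),
}


def follows_ranking(shares):
    """Return True if singular > plural (> dual, where a dual is reported)."""
    present = [s for s in shares if s is not None]
    return all(a > b for a, b in zip(present, present[1:]))


def total(shares):
    """Sum of the reported shares, rounded to one decimal place."""
    return round(sum(s for s in shares if s is not None), 1)


for language, shares in counts.items():
    print(f"{language}: total {total(shares)}%, ranking holds: {follows_ranking(shares)}")
```

Every row sums to 100% and respects the order singular > plural > dual, which is all that the generalization in Table 12.1 claims; the size of the gap between the values is left open.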

Why should such frequency asymmetries exist? To start with, the 
nominative can be expected to be more frequent than the accusative, at 
least in languages that do not allow unexpressed arguments, because all 
verbs require a nominative argument (i.e. a subject), but only transitive verbs 
also have an accusative. Similarly, the subjunctive must be rarer than the 
indicative because subjunctives are used primarily in subordinate clauses, 
and a subordinate clause presupposes a main clause with, at least typically, 
an indicative verb. However, the ultimate reason for the different frequencies 
of different inflectional values is outside language. Some expressions are 
more frequent simply because humans find them more relevant: we all 
talk more about singular entities than about plural entities, more about 
third persons and things than about speech act participants (first/second 
person), more about present events than about future events, and so on. 
The linguist has no privileged skills for explaining these preferences, so we 
will not discuss them further. Instead, we will focus on structural properties 
that correlate with frequency. 


12.1.2 The correlation between frequency and shortness 


Quite generally, frequent expressions tend to be short in human languages. 
Frequent words are shorter than rare words. For example, in French the 10 
most frequent words are de, le, la, et, les, des, est, un, une, du, and long words 
like éléphant or questionnaire are used rarely. 

Even more strikingly, frequently used inflectional values may not be 
expressed overtly at all but are left to be inferred from the context — i.e. they 
sometimes show zero expression. This is just one more manifestation of the 
correlation between frequency and shortness. As an example, consider the 
partial inflectional paradigm of regular nouns in Udmurt, given in (12.2). 


(12.2)              SINGULAR  PLURAL 
       NOMINATIVE   val       valjos     ‘horse(s)’ 
       ACCUSATIVE   valez     valjosty   ‘horse(s) (dir. obj.)’ 
       ABLATIVE     valleś    valjosleś  ‘from the horse(s)’ 
       ABESSIVE     valtek    valjostek  ‘without the horse(s)’ 

       (Perevoščikov 1962: 86-7) 


In this paradigm, the more rarely used cases, ablative and abessive, have a 
longer form than the more frequently used accusative. The nominative and 
the singular are the shortest: they are both expressed by zero. This Udmurt 
paradigm is quite typical of inflectional systems. Zero expression is found 
in frequent values, and when two contrasting values are both overtly coded, 
typically the more frequently used value has the shorter expression. Two 
more examples from verbal inflection are given in (12.3) and (12.4). 


(12.3) Tzutujil  COMPLETIVE  INCOMPLETIVE  POTENTIAL 
       1SG       x-in-wari   n-in-wari     xk-in-wari 
       2SG       x-at-wari   n-at-wari     xk-at-wari 
       3SG       x-wari      n-wari        xti-wari 
       1PL       x-oq-wari   n-oq-wari     xq-oo-wari 
       2PL       x-ix-wari   n-ix-wari     xk-ix-wari 
       3PL       x-ee-wari   n-ee-wari     xk-ee-wari 

       (Dayley 1985: 87-8) 
(12.4) Kobon   PRESENT    FUTURE      CONDITIONAL 
       1SG     ar-ab-in   ar-nab-in   ar-bnep 
       2SG     ar-ab-ön   ar-nab-ön   ar-bnap 
       3SG     ar-ab      ar-nab      ar-böp 
       1DU     ar-ab-ul   ar-nab-ul   ar-blop 
       2/3DU   ar-ab-il   ar-nab-il   ar-blep 
       1PL     ar-ab-un   ar-nab-un   ar-bnop 
       2PL     ar-ab-im   ar-nab-im   ar-bep 
       3PL     ar-ab-öl   ar-nab-öl   ar-blap 

       (Davies 1981: 166, 181) 


Both these paradigms show zero expression in the third person singular. 
The Tzutujil paradigm shows that the non-indicative form (called 
‘potential’) has a longer marker than the indicative forms, and the Kobon 
paradigm shows a longer marker for future tense than for present tense. 
The conditional mood in Kobon is marked by the two consonants b and 
p, so it is longer than the present indicative form, which has just a single 
consonant (this assumes that consonants are more important in counting 
length than vowels). 
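
The length comparison made for the Udmurt paradigm in (12.2) can be spelled out as a small calculation. This is a rough sketch rather than an established measure: it counts the characters that each word-form adds to the bare stem val, the names STEM, paradigm and exponent_length are invented for the illustration, and the ablative forms are transliterated here as valleś and valjosleś.

```python
STEM = "val"  # bare stem 'horse'; identical to the nominative singular

# (case, number) -> word-form, following the paradigm in (12.2).
paradigm = {
    ("nominative", "singular"): "val",
    ("accusative", "singular"): "valez",
    ("ablative", "singular"): "valleś",   # transliteration assumed
    ("abessive", "singular"): "valtek",
    ("nominative", "plural"): "valjos",
    ("accusative", "plural"): "valjosty",
    ("ablative", "plural"): "valjosleś",  # transliteration assumed
    ("abessive", "plural"): "valjostek",
}


def exponent_length(form, stem=STEM):
    """Number of characters that the word-form adds to the bare stem."""
    if not form.startswith(stem):
        raise ValueError(f"{form!r} does not begin with the stem {stem!r}")
    return len(form) - len(stem)


for (case, number), form in paradigm.items():
    print(f"{case:<11}{number:<9}{form:<10}{exponent_length(form)}")
```

On this crude count the nominative singular has zero expression, the accusative singular adds two characters, and the rarer ablative and abessive add three; every plural form is longer than the corresponding singular.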


12.1.3 The correlation between frequency and differentiation 


In three different senses, frequently used values tend to be more differentiated 
than rarely used values.¹ First, frequent values show less syncretism than 
rare values. Consider the partial paradigm of the Old English verb bindan 
‘bind’ in (12.5). 


(12.5)          PRESENT IND  PRESENT SBJV  PAST IND  PAST SBJV 
       1 SG     binde        binde         band      bunde 
       2 SG     bintst       binde         bunde     bunde 
       3 SG     bint         binde         band      bunde 
       1-3 PL   bindaþ       binden        bundon    bunden 


This paradigm shows that there is more syncretism in the plural than in 
the singular (in fact, all plural forms of all verbs are syncretized in Old 
English), more syncretism in the subjunctive than in the indicative, and 
more syncretism in the past indicative than in the present indicative. The 
same tendency is found in Khanty possessive suffixes: 


(12.6)        SINGULAR  PLURAL  DUAL 
       1ST    -êm       -wW     -emon 
       2ND    -ên       -lan    -lan 
       3RD    -l        -el     -lan 

       (Nikolaeva 1999: 14) 


1  Generalizations about a correlation between frequency of use and linguistic structure are 
based on strong tendencies, but should not be treated as inviolable principles. For every 
generalization there are counterexamples. 


Here, syncretism is found in the rarest of the three number values, the dual, 
and in one of the rarer person values, second person. (More syncretism in 
the dual can also be seen in Kobon; see (12.4) above.) More syncretism in 
the passive than in the active voice can be exemplified from Gothic (niman 
‘take’). 


(12.7)        ACTIVE              PASSIVE 
              SINGULAR  PLURAL    SINGULAR  PLURAL 
       1ST    nima      nimam     nimada    nimanda 
       2ND    nimis     nimiþ     nimaza    nimanda 
       3RD    nimiþ     nimand    nimada    nimanda 


The active has five different shapes, and the passive has only three. 
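
The count of distinct shapes in (12.7) can be reproduced in a few lines. This is a minimal sketch, with the variable and function names chosen only for illustration; syncretism is measured here simply as the number of paradigm cells minus the number of distinct word-forms that fill them.

```python
# Gothic niman 'take', present tense, cells ordered 1SG, 1PL, 2SG, 2PL, 3SG, 3PL
# as in (12.7).
gothic = {
    "active": ["nima", "nimam", "nimis", "nimiþ", "nimiþ", "nimand"],
    "passive": ["nimada", "nimanda", "nimaza", "nimanda", "nimada", "nimanda"],
}


def distinct_shapes(cells):
    """Number of different word-forms found across the cells."""
    return len(set(cells))


def syncretism(cells):
    """Cells that share their form with another cell: total minus distinct."""
    return len(cells) - distinct_shapes(cells)


for voice, cells in gothic.items():
    print(voice, distinct_shapes(cells), syncretism(cells))
```

The same function can be applied column by column to the Old English paradigm in (12.5) to compare the indicative with the subjunctive.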

The second sense in which frequent values are more differentiated is that 
inflection classes differ primarily with respect to the frequent values, less 
so with respect to rare values. In other words, the classes have fewer shared 
exponents in the frequently used values. This can be seen in Russian noun 
inflection. The endings of the four most important Russian inflection classes 
are shown in (12.8) (the inflection classes are labelled I-IV, as is traditional. 
See Exercise 5 of Chapter 8 for full word-forms belonging to each of these 
classes). 


(12.8) 
                SINGULAR                    PLURAL 
          IV     I      III    II      IV     I      III    II 
NOM       -o     Ø      Ø      -a      -a     -i     -i     -i 
ACC       -o     Ø      Ø      -u      -a     -i     -i     -i 
GEN       -a     -a     -i     -i      Ø      -ov    -ej    Ø 
DAT       -u     -u     -i     -e      -am    -am    -am    -am 
LOC       -e     -e     -i     -e      -ax    -ax    -ax    -ax 
INSTR     -om    -om    -ju    -oj     -ami   -ami   -ami   -ami 

The contrast between singular and plural is clear: in the singular, there are 
at least twelve distinct endings, while in the plural there are at most eight. 
And, at least in the plural, the rarer cases (dative, locative, instrumental) 
show fewer allomorphs than the more frequent cases. Likewise in Standard 
Arabic, transitive verbs belong to one of four inflection classes, characterized 
by different vowels before the final stem consonant. However, in the rarer 
passive voice the inflection is uniform and the difference between the 
inflection classes disappears (see (12.9)). 


(12.9)        ACTIVE                     PASSIVE 
              PERFECT  IMPERFECT         PERFECT  IMPERFECT 
       a-u:   qatala   yaqtulu    i-a:   qutila   yuqtalu    ‘kill’ 
       a-i:   daraba   yadribu    i-a:   duriba   yudrabu    ‘hit’ 
       i-a:   hafiza   yahfazu    i-a:   hufiza   yuhfazu    ‘protect’ 
       a-a:   jamaʕa   yajmaʕu    i-a:   jumiʕa   yujmaʕu    ‘gather’ 


The third sense in which frequently used values are more differentiated 
is that they tend to show more cross-cutting values. For example, as we saw 
in Section 5.1, the Latin future tense lacks a subjunctive mood (or one could 
alternatively say that the subjunctive mood lacks a future tense). In (12.10), 
we again see the third person singular of the verb laudare ‘praise’. 


(12.10) PRESENT TENSE PAST TENSE FUTURE TENSE 
INDICATIVE laudat laudabat laudabit 
SUBJUNCTIVE laudet laudaret — 


Lack of cross-cutting values is similar, but not identical to syncretism. In 
Latin, the distinction between indicative and subjunctive is not neutralized 
in the future tense. The form laudabit (‘she will praise’) expresses only the 
indicative, and future tense cannot be expressed directly in the subjunctive. 


12.1.4 Local frequency reversals 


Table 12.1 shows the frequency asymmetries that hold in general in 
languages. However, in particular lexemes, the frequency relations may be 
reversed. For instance, while most nouns (such as ‘table’, ‘head’ or ‘doctor’) 
occur more often in the singular than in the plural, a small group of nouns 
tend to occur more often in the plural in many, if not all, languages. These 
are nouns referring to some paired or multiple body parts (‘eyes’, ‘lips’, 
‘hair(s)’), small animals (‘ants’, ‘fish’, ‘mice’), small parts of plants (‘beans’, 
‘strawberries’, ‘leaves’), and some others (‘sand grains’, ‘splinters’). 

In the case feature, nouns that denote a place occur in the locative case 
more often than in the nominative, in contrast to other nouns. And, while 
the greater relative frequency of the nominative case is clearly true of 
animate nouns that may occur as subjects of transitive clauses, it is not so 
clear that inanimate nouns, which are typically patients, are also used more 
frequently in the nominative than in the accusative case. 

Local frequency reversals may also be found in particular cross-cutting 
values. While in general the third person is more frequent than the second 
person, in the imperative mood this relation is reversed: commands are 
more often addressed to the person who is supposed to carry them out, 
and indirect imperatives (with the subject in the third person) are rare in 
all languages. 

Structural effects of these frequency reversals can be observed in 
many languages. In Welsh, plurals are normally marked by suffixes as in 
other Indo-European languages (cath/cathod ‘cat/cats’, draenog/draenogod 
‘hedgehog/hedgehogs’; see (8.10) for more examples). However, in certain 
nouns that are used frequently in the plural, it is the singular that is marked 
by a special suffix: 


(12.11) dail    ‘leaves’         deilen    ‘leaf’ 
        pysgod  ‘fish (PL)’      pysgodyn  ‘fish (SG)’ 
        ffa     ‘beans’          ffäen     ‘bean’ 
        cacwn   ‘wasps’          cacynen   ‘wasp’ 
        mefus   ‘strawberries’   mefusen   ‘strawberry’ 
        tywys   ‘corn’           tywysen   ‘ear of corn’ 

        (King 1993: 67-9) 


There are also languages in which case marking is found only in inanimate 
nouns (or non-personal pronouns). Godoberi is such a language, and here 
it is the (transitive) subject case that is overtly marked, whereas the direct- 
object case is zero (12.12)? Presumably, this is connected to the fact that 
inanimate nouns are more likely to be used as objects than as subjects. 


(12.12) (transitive) subject case — den-Ó T hanqu-di ‘house’ 
direct-object case den-O ‘me’ . hanqu-O ‘house’ 
(Kibrik 1996: 119, 36) 


And in the imperative, the second person form is often zero while the third 
person form is overtly marked (e.g. Latin second person imperative lauda 
‘praise!’, third person imperative laudato ‘let him/her praise!’). 

Local frequency reversals occur in derivational morphology as well. In 
many languages female person nouns are derived by a special affix from 
the corresponding male or general person noun - e.g. Dutch handelaar
‘(male) merchant’, handelaarster ‘female merchant’, Hausa àbookii ‘(male)
friend’, abookiyaa ‘female friend’. From the point of view of the semantics, it
would be equally possible to have a special affix that denotes male persons, 
but such affixes seem to be extremely rare. One reason for this asymmetry 
is probably that, in most societies, men tended to have more specialized
roles, so that at least person nouns that denote professions and occupations
are more frequently applied to men. Thus, the direction of derivation
(from male/general to female) is related to frequency of use. However, the
frequency relations tend to be reversed with nouns like nurse (because more
women are nurses than men) and widow (probably not because husbands
die more often than wives, but because marital status has traditionally
been considered more relevant for women than for men). As a result of
the unusual frequency relations, we get unusual male forms with overt
marking (widow-er, male nurse).

² Instead of the familiar terms ‘nominative/accusative’, the terms subject case/object case
are used here, because overtly marked subject cases are usually called ‘ergative’ rather
than ‘nominative’.


12.1.5 Explaining the correlations 


The correlation between frequency and shortness is clearly motivated by 
language users’ preference for economical structures. Speakers can afford 
shorter expressions (or even zero expressions) when these are frequent, 
because frequent expressions are more predictable and are therefore those 
that are expected by default. The basic principle here is the same as in many 
other areas of human communication. For instance, in many countries local 
phone calls do not require an area code because phone calls to the local area 
are more common than phone calls to other areas. 

In language, such economical structures may arise when a new distinction 
is introduced that is coded only in one of the two contrasting values. For 
instance, Spanish has a new subject/object distinction, which is marked 
by the preposition a with animate NPs (e.g. Veo a mi hermano [see.1SG to
my brother] ‘I see my brother’). This does not have morphological status 
yet, but if it becomes grammaticalized as an accusative case prefix, we will 
have a case system that conforms to the pattern in (12.12), in which the 
less frequently used case form gets the overt marking. The nominative was 
never marked overtly from the beginning of this change. 

Another way in which an economical case-marking system may arise 
is by selectively preserving older markers. For example, in the Old High 
German n-declension, animate and inanimate nouns alike had a distinction 
between nominative and accusative (see (12.13)). 


(12.13)          Old High German      Modern German
        NOM.SG   affo     knoto       Affe     Knoten
        ACC.SG   affon    knoton      Affen    Knoten
                 ‘ape’    ‘knot’      ‘ape’    ‘knot’


Then the nominative/accusative distinction was lost in inanimate nouns, 
and in Modern German only animates preserve the zero marking in the 
nominative. Again, the resulting pattern conforms to (12.12), but it has 
come into existence via a different diachronic route. 

The correlation between frequency and differentiation is due to the
greater memory strength of frequent values. When a value occurs rarely, it
is more difficult to remember all the details of that value, so syncretism is 
more common in rare values, and various suppletive allomorphs are more 
easily kept apart in the frequent values. 


12.2 The direction of analogical levelling 


Analogical levelling is a common type of morphological change. Levelling 
eliminates morphophonological alternations by extending one stem 
alternant to other word-forms in the paradigm. For instance, many speakers 
of English have eliminated the alternation in house/houses, which in the 
traditional pronunciation has a voiced final stem consonant in the plural: 
[haus]/[hauzəz]. Now crucially, it is the form of the singular stem that is
extended by the innovating speakers ([haus]/[hausəz]), not the plural stem.
There are no English speakers that pronounce the singular noun house as 
[hauz]. 

This change is typical of analogical levelling in general: the form of the 
stem that is extended within the paradigm is usually the value with higher 
frequency. That frequency is the crucial factor is particularly clear from
cases of local frequency reversals. A striking case comes
from West Frisian, where in the traditional language many nouns show a
vowel alternation in singular-plural pairs. In innovative varieties of the
language, this alternation is eliminated and the singular and plural stems
are identical again (see (12.14)).


(12.14) conservative innovative 

a. hoer/hworren hoer/hoeren ‘whore(s)’ 
koal/kwallen koal/koalen ‘coal’ 
miel/mjillen miel/mielen ‘meal, milking’
poel/pwollen poel/poelen ‘pool(s)’ 

b. earm/jermen jerm/jermen ‘arm(s)’ 
kies/kjizzen kjizze/kjizzen ‘tooth/teeth’ 
hoarn/hwarnen hwarne/hwarnen ‘horn(s)’ 
trien/trjinnen trjin/trjinnen ‘tear(s)’ 


(Tiersma 1982: 834) 


In (12.14a), the singular stem form is extended in analogical levelling, but, 
in (12.14b), the plural stem form is extended. The choice of the form that 
is extended is by no means arbitrary: when the noun denotes a thing that 
tends to occur in groups and hence is more frequent in the plural, the plural 
stem wins out. 

An example from case inflection is Latin oleum ‘olive tree’, which goes
back to an earlier form oleivum (cf. oleiva, later oliva, ‘fruit of the olive tree,
olive’). Then three sound changes occurred: (i) the diphthong ei turned into
ē and later into ī, (ii) the semivowel v [w] was dropped before u and (iii) long
vowels were shortened before another vowel. As a result, the nominative/
accusative form oleivum successively became olēvum, olēum and oleum,
whereas the genitive and dative forms oleivi/oleivo became olivi/olivo. Then,
analogical levelling extended the nominative/accusative stem to the other
case forms (oleiva became oliva and retained the stem oliv-, because the v
never dropped from its paradigm):


(12.15)              oldest form   later form   Classical Latin
        NOM/ACC.SG   oleivum       oleum        oleum
        GEN.SG       oleivi        olivi        olei
        DAT.SG       oleivo        olivo        oleo


The greater stability of frequent stem forms can be explained again by 
memory strength and speed of lexical access. The genitive singular olivi is
replaced by olei because the stem ole- has higher memory strength and may
thus be used when a speaker (temporarily) forgets the old form oliv-, or
because ole- can be retrieved more quickly from the lexicon and combined
with the suffix -i than the form olivi, with its rarer stem form oliv-.


12.3 Frequency and irregularity 


In language after language, if there are irregularities in inflection, these 
primarily affect the most frequent lexemes. Our first example comes from 
Koromfe, which has scores of regular verbs like those in (12.16a), and a few 
irregular verbs like those in (12.16b). 


(12.16) a. HABITUAL PAST b. HABITUAL PAST 
kam kame ‘squeeze’ be ben-e ‘come’ 
tarı tare ‘plaster’ bo bol-e ‘say’ 
leli lele ‘sing’ te ter-e ‘arrive’ 


(Rennison 1997: 271-5)

In Welsh, there are four irregular verbs whose past tense is totally unlike
the past tense of a regular verb such as gwel- in (12.17a). Three of them are
shown in (12.17b).


(12.17)       a. gwel-d ‘see’   b. myn-d ‘go’   gwneu-d ‘do’   do-d ‘come’
        1SG   gwel-es i         es i            nes i          des i
        2SG   gwel-est ti       est ti          nest ti        dest ti
        3SG   gwel-odd e        aeth e          naeth e        daeth e


(King 1993: 183)

In Old English, grammars list just four verbs that are totally irregular and
cannot be fitted into any of the inflectional classes. These are shown in
(12.18b), and a regular verb is shown in (12.18a).


(12.18)              a. ‘bind’   b. ‘be’   ‘do’    ‘go’    ‘want’
        1SG.PRS      binde       eom       dō      gā      wille
        2SG.PRS      bintst      eart      dēst    gǣst    wilt
        3SG.PRS      bint        is        dēþ     gǣþ     wille
        1-3PL.PRS    bindaþ      sint      dōþ     gāþ     willaþ
        1SG.PST      band        wæs       dyde    ēode    wolde
        PARTICIPLE   gebunden    -         gedōn   gegān   -


Thus, the verbs that tend to show irregularities are those that mean 'be', 
‘do’, ‘go’, ‘come’, ‘say’, and so on — i.e. precisely those verbs that are used 
the most frequently across languages. 

In nouns, the situation is more or less the same. For example, in Lango 
regular plural suffixes are -é, -nì and -í. Some regular and most of the 
irregular nouns are listed in (12.19). 


(12.19) a. réc    réc-é    ‘fish(es)’    b. dákó   món     ‘woman/women’
           pónó   pün-ni   ‘pig(s)’         naké   àpirà   ‘girl(s)’
           lé     ley-i    ‘axe(s)’         icd    cà      ‘man/men’
                                            dind   jd      ‘person/people’
                                            dyan   dòk     ‘cattle’
                                            gin    gigh    ‘thing(s)’
(Noonan 1992: 83-5) 


Irregular noun plurals in Bulgarian include oko/oči ‘eye(s)’, uxo/uši ‘ear(s)’,
dete/deca ‘child(ren)’, and Italian has the three irregular nouns uomo/uomini
‘man/men’, dio/dei ‘god(s)’, bue/buoi ‘ox(en)’. The appearance of words for
‘cattle’ and ‘ox’ on several of these lists may at first seem surprising - these 
are certainly not among the most frequent nouns in modern Italian and 
modern English. But in modern Lango they may well be (cattle herding is 
one of the main economic activities of Lango speakers), and in older Italian 
and older English the situation may have been similar. 

There are two rather different ways in which frequency may cause 
irregularity in morphology. On the one hand, frequency leads to phonological 
reduction, because frequent expressions are relatively predictable, so that 
speakers can afford to articulate less clearly. This factor must be invoked 
to explain the irregularities in Koromfe verbs in (12.16). Examples from 
English are the verbs have, say and make, which were completely regular in 
earlier English (haved, sayed, maked), but became irregular because they were 
subjected to greater phonological reduction than comparable rarer verbs 
(e.g. said versus played, had versus behaved, made versus faked).

On the other hand, frequency leads to memory strength and fast lexical
access, so that frequent items are less susceptible to analogical levelling 
and other regularizations. So, while frequency causes faster phonological 
change, with respect to morphology it has a conserving, decelerating 
function. For example, the irregular Italian noun uomo/uomini ‘man/men’ 
preserves an old declension type inherited from Latin (homo/homines) that 
was otherwise eliminated by regularizing changes (cf. Latin virgo/virgines 
‘virgin(s)’, Italian vergine/vergini). This conserving effect of frequency is also 
the cause of the Bulgarian irregular plurals oči ‘eyes’ and uši ‘ears’. These
were originally dual forms, and, because eyes and ears typically occur 
in pairs, these word-forms were probably the most frequent forms in the 
paradigm. Since eyes and ears are among the most frequently used paired 
body parts, it is not surprising that these forms survive. 

From a diachronic point of view, the least well-understood type of 
irregularity is stem suppletion, as seen in Welsh myn-/es-/aeth, Old English
is/wæs, gǣþ/ēode, and Lango dákó/món. It is difficult to understand why
speakers would begin to associate roots that originally came from two 
different lexemes and integrate them as word-forms of the same lexeme. But, 
granted that speakers sometimes do that, the conserving effect of frequency 
will maintain the suppletion in the most frequent lexemes. It is also worth 
pointing out that inflection class differentiation (which we discussed in 
Section 12.1.3) works in exactly the same way: different markers for the 
same meaning/inflectional values can be maintained if the items affected 
are sufficiently frequent, whether owing to the frequency of the inflectional 
value or to lexeme frequency. 


Summary of Chapter 12 


Token frequency is relevant to morphology because frequent words 
occur more predictably in context, are more easily remembered and are 
retrieved faster than rare words. Because speakers favour economical 
structures, the greater predictability of frequent values typically 
results in zero expression (or otherwise short expression). Frequent 
values are also more differentiated — they show less syncretism, 
fewer shared exponents and more cross-cutting values. Because 
frequent words and values are more easily remembered, they are less 
subject to analogical levelling, and this is also one of the reasons why 
irregularities exist mostly in frequent words. Another reason is that 
frequent words are subject to greater phonological reduction, again 
because of predictability. Over time, frequency effects thus shape 
(inflectional) morphological structure in a number of ways.


Further reading


Frequency differences between inflectional values of the same feature are 
discussed (under the name of ‘markedness’) by Greenberg (1966) and 
Croft (1990: ch. 4). Haspelmath (2006) argues that an abstract notion of 
markedness is superfluous once the role of frequency is appreciated. 

The insight that frequency is the explanation for shortness was already 
emphasized by Zipf (1935). For local frequency reversals, see Tiersma 
(1982). For the relation between frequency and irregularity, see Mańczak
(1980a, b), Werner (1989), Bybee (1995), Nübling (2001), Corbett et al. (2001) 
and Brown et al. (2007). For the view that grammatical structure (including 
morphology) cannot be adequately understood without considering 
frequency effects, see Bybee (2006) and the papers in Bybee and Hopper 
(2001). 


Comprehension exercises 


1. The general correlation between frequency and shortness leads to 
certain expectations about inflectional paradigms. Consider the 
following (partial) paradigms and determine where these expectations 
are fulfilled, and where we should be surprised. 


a. Udmurt conjugation: past tense of uck- ‘look’

   1SG ucki     1PL uckimy
   2SG uckid    2PL uckidy
   3SG uckiz    3PL uckizy

   (Perevoščikov 1962: 203)


b. Even declension: juu ‘house’

         SG        PL
   NOM   juu       juul
   ACC   juuw      juulbu
   DAT   juudu     juuldu
   COM   juufiun   juulfiun
   ABL   juuduk    juulduk

   (Malchukov 1995: 9)


c. Pipil possessive inflection: nu-chi:l ‘my chilli pepper’, etc.

   1SG nu-chi:l    1PL tu-chi:l
   2SG mu-chi:l    2PL amu-chi:l
   3SG i-chi:l     3PL in-chi:l

   (Campbell 1985: 43)

d. Tauya possessive inflection: ya-potiyafo ‘my hand’, etc.


   1SG ya-potiyafo    1PL sono-potiyafo
   2SG na-potiyafo    2PL tono-potiyafo
   3SG potiyafo       3PL nono-potiyafo


(MacDonald 1990: 129-30) 


2. The Modern French verb trouver 'find' used to have two different forms 
of the stem in older French, trouv- and treuv-. The former occurred in 
word-forms that were stressed on the suffix, and the latter occurred 
in word-forms that were stressed on the stem. (A dot below the syllable 
indicates the position of the stress.) This stem alternation no longer 
exists in modern French: all forms of the verb trouver have the same 
stem vowel. Why is this change surprising after what we learned in 
this chapter? 


older French modern French 
‘I find’ je treuve je trouve
‘you find’ tu treuves tu trouves
‘he finds’ il treuve il trouve
‘we find’ nous trouvons nous trouvons 
‘you(PL) find’ vous trouvez vous trouvez 
‘they find’ ils treuvent ils trouvent 


3. Go back to Chapter 10, where morphophonological alternations were 
discussed. Where did we make reference to frequency in that chapter? 
How did what we said there fit with the claims of this chapter? 


Exploratory exercise 


In this chapter we saw that frequency asymmetries and structural 
asymmetries are correlated. For instance, a noun lexeme's singular forms 
tend to be more frequently used than its plural forms, and correspondingly, 
case forms tend to be more differentiated in the singular and singular 
exponents tend to be shorter than their plural counterparts. This pattern 
can be seen even at the level of individual lexemes. Where the plural is 
more frequently used (a frequency reversal), that lexeme is likely to exhibit 
more differentiation in the plural, and shorter plural forms. 

We also claimed that frequent lexemes tend to be irregular. The reader 
may have noticed, however, that this discussion was based on a different 
type of frequency comparison. Rather than looking at the relative frequency 
of different paradigm cells, we compared the frequency of use of different 
lexemes. We did not consider whether irregularity correlates with frequency 
asymmetries within a lexeme. But given the importance of frequency at 
this level, we might ask whether such a correlation exists. In other words,
are more frequently used cells in a paradigm more (or less) likely to be
irregular? You will develop an answer to this question. 

This exercise is based on (but simplifies) Corbett et al.’s (2001) study of 
Russian, and we use Russian for demonstration purposes below. You need 
not choose this language to investigate, but a good frequency dictionary 
or frequency list, with information for individual word-forms, must be 
available. (Hint: to make the process of finding word-form frequencies 
more efficient, it is helpful if the frequency dictionary /list is available for 
electronic searching.) 


Instructions 


Step 1: Choose a morphological pattern that exhibits (strong) stem 
suppletion in at least a handful of lexemes. A few examples of Russian 
singular-plural noun pairs are given below. As can be seen, the examples 
in (12.20a) have the same stem for the singular and plural, but the ones in
(12.20b) exhibit suppletion. 


(12.20)    SINGULAR   PLURAL     GLOSS
        a. zavod      zavody     ‘factory’
           student    studenty   ‘student’
        b. syn        synov'ja   ‘son’
           rebënok    deti       ‘child’
           čelovek    ljudi      ‘person’


Since suppletion in the Russian examples is according to number, we 
want to ask how frequently the plural is used, compared to the singular, 
and whether this differs depending on whether the noun is regular or 
suppletive. An appropriate measure is thus the ratio of plural frequency to 
singular frequency, e.g. token frequency of ljudi divided by token frequency
of čelovek. (For demonstration purposes we are ignoring the fact that Russian
has nominal cases.) 

Step 2: Develop a hypothesis. Based on what you have read in this chapter 
and elsewhere in the book (e.g. discussion of frequency in Section 4.3), 
make a guess about what the answer to the research question will be. For 
instance, would you expect Russian lexemes with suppletive stems in the 
plural to be more/less frequently used in the plural than in the singular? 
What would you expect for lexemes with the same stem throughout? 
Explain your reasoning. 

Step 3: Build a list of lexemes with stem suppletion. Identify as many 
relevant words as possible. 

Step 4: Using a frequency dictionary or frequency list, gather token 
frequency counts. For each of the suppletive lexemes from Step 3, find the 
token frequency of each word-form in its paradigm. Then, do the same for 
at least 10 lexemes that have the same stem throughout the paradigm (the
(12.20a) type), and which are of similar overall frequency to the suppletive
lexemes (or as close as possible). 

Step 5: For each lexeme, calculate the relevant frequency ratio. For 
instance, according to one count, the lexeme ČELOVEK occurs in the singular
1678 times per million words of text, and in the plural 1267 times per million 
words of text (Sharoff 2002). Its ratio is thus 0.755 (= 1267/1678). 
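
The arithmetic in Step 5 can be sketched in a few lines of Python. This is only an illustration: the function name plural_singular_ratio is our own invention, and the only figures used are the two per-million counts quoted in Step 5.

```python
def plural_singular_ratio(singular_count: float, plural_count: float) -> float:
    """Return plural token frequency divided by singular token frequency."""
    if singular_count <= 0:
        # A lexeme that never occurs in the singular has no defined ratio.
        raise ValueError("singular frequency must be positive")
    return plural_count / singular_count


# Per-million counts for the lexeme quoted in Step 5 (Sharoff 2002).
ratio = plural_singular_ratio(singular_count=1678, plural_count=1267)
print(round(ratio, 3))
```

A ratio below 1 means that the singular is used more often than the plural; a ratio above 1 would indicate a local frequency reversal for that lexeme.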

Step 6: Evaluate the data and draw conclusions. Compare the relative 
frequency ratios for the two groups. Are there any notable differences in 
the frequency ratios? Do the results match your predictions? Consider the 
implications for understanding the relationship (or lack thereof) between 
the frequency of paradigm cells and irregularity. 
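
One possible way to organize the comparison in Step 6 is sketched below. It rests on assumptions of our own: the ratios are grouped in a dictionary, each group is summarized by its median (chosen here because a single very frequent lexeme cannot dominate it), and the function name summarize_groups is hypothetical. The numbers are made-up placeholders, not real frequency data.

```python
from statistics import median


def summarize_groups(ratios_by_group):
    """Map each group name to (number of lexemes, median plural/singular ratio)."""
    return {
        group: (len(ratios), median(ratios))
        for group, ratios in ratios_by_group.items()
        if ratios  # skip groups for which no counts were found
    }


# Made-up placeholder ratios, NOT real frequency data; replace them with
# the ratios calculated in Step 5.
placeholder_ratios = {
    "suppletive": [0.5, 1.0, 2.0],
    "same stem": [0.25, 0.5, 0.75, 1.0],
}

for group, (n, mid) in summarize_groups(placeholder_ratios).items():
    print(f"{group}: n={n}, median ratio={mid}")
```

With only a handful of lexemes per group, a summary of this kind is suggestive at best; it is worth inspecting the individual ratios as well.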

Alternative: Irregularity can be treated as a scale ranging from strong 
suppletion to full regularity (see Chapter 2). Rank different kinds of stem 
irregularity, and look for a correlation between degree of irregularity and 
frequency. (This is much harder!) 


Key to comprehension exercises


Chapter 1 


1. complex words: 
nights (night-nights, cat-cats, trick-tricks, lap-laps,...) 
playing (play-playing, think-thinking, run-running, hop-hopping,...) 
affordable (afford-affordable, accept-acceptable, form-formable,...) 
indecent (decent-indecent, accurate-inaccurate, adequate-inadequate,...) 
searched (search-searched, cough-coughed, pass-passed, laugh-laughed,...) 
hopeful (hope-hopeful, mercy-merciful, colour-colourful, deceit-deceitful,...) 
redo (do-redo, think-rethink, absorb-reabsorb, read-reread,...) 


not complex: owl, religion, indolent, bubble, during 


2. chang ‘music’           ché ‘vehicle’
   chuán ‘ship’            cí ‘word’
   deng ‘lamp, light’      dian ‘electric’
   ding ‘top’              dòngwù ‘animal’
   fang ‘house’            fei ‘fly’
   hua ‘flower’            jī ‘machine’
   jiǎo ‘leg’              ké ‘customer, visitor’
   li ‘power’              qi ‘steam, gas’
   shan ‘mountain’         shi ‘sight, see’
   shu ‘number’            shuǐ ‘water’
   wei ‘tail, part’        xué ‘study of’
   you ‘oil’               yu ‘fish’
   yuan ‘garden’           zhi ‘paper’

3. wari ‘sleep’            eeli ‘leave’
   in- ‘I’                 e7- ‘they’
   ix- ‘you(PL)’           at- ‘you(SG)’
   og- ‘we’                n- ‘PRESENT TENSE’
                           ‘PAST TENSE’

Note that the third person singular (‘he or she’) does not correspond to a
morphological constituent, but the word neeli ‘he or she leaves’ can (and 
must) be interpreted as having the meaning 'he or she' because of the 
absence of any other person marker. 


4. In Hebrew, the lexical meaning is represented by the consonants, so the 
root dbr corresponds to SPEAK, which shows up in the word-forms diber
‘he spoke’, dibra ‘she spoke’, dibur ‘speech’. The vowels correspond to 
grammatical information and word-class. There are four identifiable 
sets of word pairs in this data that exhibit a morphological relationship. 


Set 1: diber ‘he spoke’ dibra ‘she spoke’ 
kimet ‘he wrinkled’ kimta ‘she wrinkled’ 
milmel ‘he muttered’ milmla ‘she muttered’ 


In Set 1, the masculine (‘he’) past tense has the structure CiC(C)eC, where 
C indicates a consonant, and anything in parentheses is optional. The 
feminine (‘she’) past tense has the structure CiCC(C)a. The difference 
between masculine and feminine (past tense) verbs thus has to do with both 
the vowel quality, and the position of the vowel in the word. 


Set 2: ħašav ‘he thought’ ħašva ‘she thought’
kalat ‘he received’ kalta ‘she received’ 
sagar ‘he shut’ sagra ‘she shut’ 


Set 2 is similar to Set 1, except that the masculine and the feminine have 
a for both vowels. The difference between masculine and feminine (past 
tense) verbs is thus represented only by the position of the vowel in the 
word. 


Set 3: dibur ‘speech’ 
kimut ‘wrinkling’ 


Set 3 contains abstract nouns derived from verbs. These nouns have the 
form CiCuC. 


Set 4: ma-klet ‘radio receiver’
ma-sger ‘lock’
ma-ħšev ‘computer’


Set 4 contains concrete nouns. In this set, there is a prefix ma-, and a stem 
with the form CCeC. 


Chapter 2 


1. Allomorph 1: -ayaal (e.g. awowe, awowayaal ‘grandfather(s)’). Conditions: 
Used when the stem ends in an [e], which is removed in the plural 
form. 

Allomorph 2: -oyin (e.g. baabaco, baabacooyin ‘palm(s)’). Conditions:
Used when the stem ends in an [o].

Allomorph 3: -aC, where C stands for a consonant that is the same as
the final consonant in the stem (e.g. beed, beedad ‘egg(s)’). Conditions: 
Used when the stem has only one syllable (and ends in a consonant). 

Allomorph 4: -Co, where C stands for a consonant that is the same 
as the final consonant in the stem (e.g. cashar, casharro ‘lesson(s)’). 
Conditions: Used when the stem has (at least) two syllables and ends in 
a consonant. 

Plurals: tuulooyin, togag, albaabbo, bustayaal 
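
The four conditions above can be restated as a small Python function, offered only as a sketch of the answer and not as a general account of the language. Several assumptions are ours: syllables are counted as runs of the vowel letters a, e, i, o, u; the final consonant is taken to be the last letter of the spelling; and the function names are hypothetical. Stems ending in a vowel other than e or o are not covered by the answer, so the sketch rejects them.

```python
VOWELS = set("aeiou")


def count_syllables(stem: str) -> int:
    """Count syllables as maximal runs of vowel letters (a simplifying assumption)."""
    count = 0
    previous_was_vowel = False
    for letter in stem:
        is_vowel = letter in VOWELS
        if is_vowel and not previous_was_vowel:
            count += 1
        previous_was_vowel = is_vowel
    return count


def pluralize(stem: str) -> str:
    """Apply the four allomorph conditions stated in the answer."""
    if stem.endswith("e"):
        return stem[:-1] + "ayaal"      # Allomorph 1: final e is removed
    if stem.endswith("o"):
        return stem + "oyin"            # Allomorph 2
    final = stem[-1]
    if final in VOWELS:
        raise ValueError("no condition in the answer covers this stem")
    if count_syllables(stem) == 1:
        return stem + "a" + final       # Allomorph 3: -aC
    return stem + final + "o"           # Allomorph 4: -Co


for stem in ["awowe", "baabaco", "beed", "cashar"]:
    print(stem, pluralize(stem))
```

Applied to the four example stems cited in the answer, the function returns the plurals given there.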


2. Phonologically conditioned. The past tense allomorph [d] appears in 
verbs whose stems end in a vowel or any voiced consonant except [d]. 
The allomorph [t] appears in verbs whose stems end in any voiceless 
consonant except [t]. The allomorph [əd] appears in verbs whose stems
end in [t] or [d]. 
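
The conditioning just described can be written out as a short decision procedure. The following Python sketch assumes that the stem-final segment is supplied as an IPA string; the inventory of voiceless consonants listed in it is our own assumption and is not taken from the exercise.

```python
# Voiceless consonants of English, written in IPA (an assumed inventory).
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ", "h"}


def past_tense_allomorph(final_segment: str) -> str:
    """Choose the past tense allomorph from the stem-final segment."""
    if final_segment in {"t", "d"}:
        return "əd"   # after [t] or [d]
    if final_segment in VOICELESS:
        return "t"    # after any other voiceless consonant
    return "d"        # after vowels and all other voiced consonants


for segment in ["t", "k", "z", "i"]:
    print(segment, past_tense_allomorph(segment))
```

The check for [t] and [d] has to come first, because [t] is itself voiceless and would otherwise receive the allomorph [t].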

3. Venezi-a Venezi-ano (0 segments difference) 

Trent-o Trent-ino (0) 

Milan-o Milan-ese (0) 

Savon-a Savon-ese (0) 

Volterr-a Volaterr-ano (1) 

Piacenz-a Piacent-ino (1) 

Ancon-a Anconet-ano (2) 

Palerm-o Palermit-ano (2) 

Trevis-o Trevigi-ano (2) 

Gubbi-o Eu-gub-ino (3) 

Bressanon-e Brissin-ese (4) 

Palestrin-a Prenest-ino (6) 

Domodossol-a Dom-ese (7) 

Ivre-a Eporedi-ese (74) 

Napoli Partenopeo (full suppletion) 
Bologn-a Petroni-ano (full suppletion)

Chapter 3 

1. a. tonal change
b. suffixation 
c. vowel change (fronting) 

d. reduplication 
e. infixation 
f. circumfixation, vowel change 

2. /his/    /19/     /13i/     /gud/    /z/
   ‘hear’   ‘PROG’   ‘again’   ‘good’   ‘3SG PRES’

3. /X/       ‘x’
   EXEL      ‘more x’
   /onX/     ‘not x’
   / iX/     ‘do x again’
   /Xīli/    ‘in a x manner’
   /CVCVC/   ‘having the quality x’
   /X4/      ‘person who studies x’
   /Xi/      ‘field of study x’

Subtraction is an attractive analysis because the final consonant 
of the feminine is different in each of the given words ([t,d,g,Lz]). A 
morphological relationship exists if a group of words shows identical 
partial resemblances in both form and meaning. These words 
show partial resemblance in meaning (they are masculine/feminine 
pairs), but since the final consonant is different in each of the feminine 
forms, the only generalization that can be made about form of all of 
the words is that the masculine form is one phonological segment 
shorter than the feminine form. If we tried to derive the feminine from 
the masculine, each word would require a different rule. If we delete a 
segment of the feminine to form the masculine, only one rule is needed. 


Chapter 4 


1. 


actual: abundance, libertarian, replay 
possible: fraternitarian, itinerance, rebagelize 
impossible: happytarian: The morpheme -(t)arian can attach only to 
bases that end in t or ty, e.g. liberty-libertarian, humanity-humanitarian, 
document-documentarian. (It is possible to have a base that is truncated 
at /t/, e.g. vegetable-vegetarian, but this does not apply to happy.) 
penchance: While penchant-penchance looks formally similar to 
abundant-abundance, in the latter pair, the noun abundance is derived 
from the adjective abundant. In the first pair, penchant is a noun, so 
penchance cannot be derived from it. 
reknow: re- must attach to a verb that expresses a repeatable action. 


compositional meaning: ability, legalize, modernize, morality, popularity, 
vaporize 
non-compositional meaning: authority, community, materialize, specialize 


The (a) forms all have allomorphy (stress placement is different in the 
base form and in the derived form, e.g. pópular, populárity); the (b) forms
have no allomorphy (e.g. módern, módernize). 




4. Finnish: cumulative expression for person and number (e.g. me 
indicates both 1ST PERSON and PLURAL NUMBER), and zero expression 
in the nominative (unlike in the other cases, there is no form overtly 
marking the nominative case). 

Ndebele: empty morph. When the root contains only a consonant or 
consonant cluster, yi- is prefixed to the root in the imperative form. (The 
underlying generalization is that the word-form must have at least two 
syllables, and yi- is added when this condition would not otherwise 
be met.) However, these forms bear an imperative marker (-a, e.g. yi- 
dl-a ‘eat!’). Since yi- does not express any distinct inflectional or lexical 
meaning, it is an empty morph. 

Serbian: the answer depends on the segmentation that is assumed. 
One possibility is: 


             SINGULAR    PLURAL                    SINGULAR   PLURAL
1ST PERSON   govor-i-m   govor-i-mo   1ST PERSON   tres-e-m   tres-e-mo
2ND PERSON   govor-i-š   govor-i-te   2ND PERSON   tres-e-š   tres-e-te
3RD PERSON   govor-i     govor-e      3RD PERSON   tres-e     tres-u


Under this analysis, the Serbian data exhibit all three phenomena. 
The morphemes -m, -mo, -š, -te, and -e/-u express person and number 
cumulatively because it is not possible to subdivide them into morphemes 
meaning ‘singular’, ‘plural’, ‘1st person’, etc. The forms -i and -e, which
occur in five of the six word-forms, are empty morphs because they do not
directly correspond to any aspect of meaning. The third person singular has 
zero expression because there is no morpheme directly corresponding to 
this grammatical meaning. 
Another possible segmentation is: 


             SINGULAR   PLURAL                   SINGULAR   PLURAL
1ST PERSON   govor-im   govor-imo   1ST PERSON   tres-em    tres-emo
2ND PERSON   govor-iš   govor-ite   2ND PERSON   tres-eš    tres-ete
3RD PERSON   govor-i    govor-e     3RD PERSON   tres-e     tres-u


This analysis has a disadvantage, in that it does not capture that the suffixes
that attach to govor- are very similar to the ones that attach to tres-. However,
under this segmentation, the Serbian data still has cumulative expression, 
but no empty morphs or zero expression. 


Chapter 5 
1. caminabas   TENSE: PAST
               PERSON: 2ND
               NUMBER: SINGULAR

   insulam     CASE: ACCUSATIVE
               NUMBER: SINGULAR

   cantabit    MOOD: INDICATIVE
               ASPECT: INFECTUM
               TENSE: FUTURE
               PERSON: 3RD
               NUMBER: SINGULAR

   books       [NUMBER: PLURAL]

   ovci        CASE: DATIVE
               NUMBER: SINGULAR

   incal       PERSON: 3RD
               NUMBER: PLURAL

   bigger      [DEGREE: COMPARATIVE]


2. affirmative polarity

             IMPERFECTIVE    PERFECT         HABITUAL
   PRESENT   kat-zawa        kat-nawa        kat-da
   PAST      kat-zawa-j      kat-nawa-j      kat-da-j

   negative polarity

             IMPERFECTIVE    PERFECT         HABITUAL
   PRESENT   kat-zawa-č      kat-nawa-č      kat-da-č
   PAST      kat-zawa-č-ir   kat-nawa-č-ir   kat-da-č-ir


3. Denominal verbs 


act like N: cannibalize 

put into N: categorize 

cover with N: butter 

use N: phone, skate, ski (new category) 


create N: unionize, terrorize, peel (new category) 


Deadjectival verbs 
factitive: flatten, legalize, blacken, modernize 


(i) not relevant to the syntax (= derivational)

(ii) not obligatory expression (= derivational)

(iii) limited applicability (= derivational) (e.g. *longly, *brownly, etc.)

(iv) (difficult to apply)

(v) relatively abstract meaning (= more typical of inflection than
derivation)

(vi) compositional meaning (= typical of inflection, but also possible
for derivation)

(vii) expression close to the base (= derivational) (there are examples
of derivation followed by -ly, e.g. abus-ive-ly, content-ed-ly, but
this is consistent with either derivation or inflection. The decisive
examples are -ly followed by derivation, e.g. clean-li-ness, man-li-
ness, like-li-hood).

(viii) no base allomorphy (= typical of inflection, but also possible for
derivation)

(ix) changes word-class (= derivational)

(x) no cumulative expression (= derivational)

(xi) not iterable (= typical of inflection, but also possible for derivation)
(e.g. *nicelyly)


Chapter 6 


1. 


The primary restriction prohibiting *bountifulity, *sonorantity, *aimlessity, 
*darkishity and *fearsomity has to do with borrowed vocabulary strata. 
The suffix -ity attaches only to Latinate bases. The suffixes -ic, -able, -ive, 
-ous and -al are all Latinate, so they create the Latinate bases electric, 
probable, captive, curious and abnormal. However, -ful, -ant, -less, -ish 
and -some are all Germanic, so bases ending in these suffixes are also 
Germanic. (Also, since Germanic bases routinely form abstract nouns 
with -ness, even if it were possible to add -ity, some of the impossible 
words would be subject to synonymy blocking (e.g. aimlessness would 
likely block *aimlessity)). 


*reknow is subject to semantic restrictions. The prefix must attach to a 
verb that expresses a repeatable action. 

*happytarian is subject to phonological restrictions. The morpheme 
-(t)arian can attach only to bases that end in t or ty, e.g. liberty-libertarian, 
humanity-humanitarian, document-documentarian. (It is possible to have a
base that is truncated at /t/, e.g. vegetable-vegetarian, but this does not
apply to happy.)
The suffix -simo is used when the stem is monosyllabic. The suffix -ma 
is used when the stem has (at least) two syllables. 


djdvas-ma ‘reading’ kóp-simo ‘cutting’ 
mángo-ma ‘squeezing’ lú-simo ‘bathing’ 
skónda-ma ‘stumbling’ pjá-simo ‘seizing’ 
tinay-ma ‘shaking’ trék-simo ‘running’ 
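
The syllable-count condition on -simo and -ma can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and counting vowel letters as syllable nuclei is an assumption that happens to suit the transliterated stems used here, not a general syllabifier.

```python
import unicodedata

# Assumption: each vowel letter is one syllable nucleus (glides such as j are not counted).
VOWELS = set(unicodedata.normalize("NFC", "aeiouáéíóú"))

def count_syllables(stem: str) -> int:
    """Count vowel letters in the stem as a rough proxy for syllables."""
    stem = unicodedata.normalize("NFC", stem)
    return sum(1 for ch in stem if ch in VOWELS)

def action_noun(stem: str) -> str:
    """Choose -simo for monosyllabic stems and -ma for longer stems."""
    suffix = "simo" if count_syllables(stem) == 1 else "ma"
    return f"{stem}-{suffix}"

for stem in ["kóp", "lú", "pjá", "trék", "mángo", "skónda"]:
    print(action_noun(stem))
```

The stems in the loop are taken from the data above; any other stem would need its own check that the vowel-letter heuristic gives the right syllable count.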


*musting is not an example of blocking; must does not have a 
form musting because it is a modal verb (and have to and must are 
interchangeable otherwise, e.g. I have to go/I must go.) 

*foots is blocked by feet. Irregular (inflectional) forms, including
feet, must be stored in the lexicon and lexical items block otherwise
productive rule application.

*cooker is blocked by cook. 

*bishopdom is blocked by bishopric. 

*teacheress is blocked by teacher. 

*slickize is not an example of blocking; -ize does not attach to monosyllabic
words, or words that end in a stressed syllable.

*certainness is blocked by certainty. 

*sisterlily is not an example of blocking; the impossibility of this word 
probably represents a phonological constraint against adjacent identical 
syllables (-li-ly). 


5. The suffix -(er)ati developed from analogical extensions. Literati is 
an established term and was the model for the original analogical 
extension. Glitter-ati (< glitter) was the first term coined by analogy,
and not coincidentally both stems contain the sequence lit(t)er,
which was the basis for the analogy. By analogy the pattern then spread 
to other stems ending in er (liber-ati (cf. liber-al), chatter-ati, soccer-ati), 
and was even further extended to stems that do not end in er (digit-erati, 
cf. digit-al). 

The suffix -scape is similarly the result of analogy, with landscape as 
the original model, and seascape as the first term coined by analogy. This 
was followed by cloudscape, skyscape, waterscape, winterscape, etc. This 
example is exactly parallel to -gate (Watergate, Irangate, coingate). 


Chapter 7 
1. (Tree diagrams; only the terminal nodes are recoverable.)

N N N: family planning adviser
N N N N: undersea [...] repair team
ADJ N N N: [...] [...] phone system
ADJ N N N: mad cow disease hysteria
N N N N N: World Trade Center rescue worker
N N N N: credit card agreement form
ADJ N N N: major league baseball game

2. asvakovida-    endocentric
   bahuvrihi-     exocentric
   divyarupa-     exocentric
   grhapati-      endocentric
   maharaja-      endocentric
   mahatman-      exocentric
   priyasakhi-    endocentric
   rajarsi-       appositional
   suklakrsna-    appositional
   sukhaduhkha-   coordinative

3. [[[in- móvil]ADJ -iz]V -able]ADJ ‘immobilizable’
   [in- [[móvil -iz]V -able]ADJ]ADJ ‘unmobilizable’

4. (Schema partly illegible.) / XoY / ADJ ‘having y with the property x’

Chapter 8

1. lumangoy
wumagayway
natakot
nauhaw
binuhat
pinunit
pinunasan


2. If the description in Figure 8.3 were adopted, the following rule schema
from Figure 8.2 could be eliminated: 


{ [ / XVS/ Nomsa ] I /XV/ accs ] á [ / XV / onse ] z [ /XVZ/ Nomai l á 


[/XVs/ ceu LZ L/S SONS ess] 
The modified version of Figure 8.2 would look like:


XVZ XVs 
XV XVs 
XVZ Xon 


XVs 

XV Xes 

XV Xon 
Xos !Xi Xas Xes Xus Xuðes XV Xes Xi Xis 
Xo !Xus Xa Xes Xu Xuóes XV  Xes Xi Xis 
Xu Xon Xa Xon Xu Xuóon XVs Xon Xeos Xeon 


Xi Xes Xu  Xudes 
Xi Xes Xu Xudes 
Xis Xon Xus Xudon 


3. An answer to this question depends on how we choose to define the 
classes. On a narrow definition, it is possible to identify at least 46 
'irregular' past tense patterns (shown below). Of these, 22 contain only 
one word (catch/caught), or have only words built on the same stem 
(stand/stood, understand/understood, withstand/withstood). One answer, 
then, is that the words belonging to these 22 classes are truly irregular, 
whereas words belonging to the remaining 24 groups at least exhibit 
some kind of shared pattern and therefore belong to an inflection class. 

Looking more broadly, however, some of these small and singleton 
groups have traits in common. For instance, there is only one word 
with the pattern [æC] > [ɔt] (catch/caught), but there are five other
words with past tense [ɔt]: seek/sought, teach/taught, bring/brought, think/
thought and buy/bought. Also, words that have both vowel change and 
a suffix ([d] or [t]) can plausibly be grouped with words that have only 
one or the other pattern. And so on. The only verb that does not form 
a past tense in any way similarly to other words is be/was (even highly 
suppletive go/went has a final [t]!). 

Thus, for almost all 'irregular' verbs, inflection classes can be 
established at a broad level. 




[d] or [t] patterns
[k] > [d]: make/made
[d] > [t]: bend/bent, build/built, lend/lent, send/sent, spend/spent
Ø > [t]: burn/burnt, learn/learnt, spill/spilt


Vowel change patterns
[i] > [e]: eat/ate
[i] > [ɛ]: bleed/bled, breed/bred, feed/fed, lead/led, meet/met, plead/pled,
read/read, speed/sped
[i] > [ɔ]: see/saw
[i] > [o]: freeze/froze, shear/shore, speak/spoke, steal/stole, weave/wove
[ɪ] > [e]: bid/bade, forbid/forbade, forgive/forgave, give/gave
[ɪ] > [æ]: begin/began, drink/drank, ring/rang, shrink/shrank, sing/sang,
sink/sank, sit/sat, spit/spat, spring/sprang, stink/stank, swim/swam
[ɪ] > [ʌ]: cling/clung, dig/dug, fling/flung, sling/slung, spin/spun,
spring/sprung, stick/stuck, sting/stung, string/strung, swing/swung,
win/won, wring/wrung
[e] > [o]: awake/awoke, break/broke, wake/woke
[e] > [ʊ]: forsake/forsook, mistake/mistook, overtake/overtook, shake/shook,
take/took
[ɛ] > [ɑ]: forget/forgot, get/got, tread/trod
[ɛ] > [o]: bear/bore, swear/swore, tear/tore, wear/wore
[ɑ] > [ɛ]: fall/fell
[æ] > [ʌ]: hang/hung
[aj] > [ɪ]: hide/hid, slide/slid
[aj] > [e]: lie/lay
[aj] > [aw]: bind/bound, find/found, grind/ground, wind/wound
[aj] > [o]: dive/dove, drive/drove, ride/rode, rise/rose, shine/shone,
strive/strove
[aj] > [u]: fly/flew
[aj] > [ʌ]: strike/struck
[ɔ] > [u]: draw/drew
[o] > [ɛ]: hold/held, uphold/upheld, withhold/withheld
[o] > [u]: blow/blew, grow/grew, know/knew, overthrow/overthrew, throw/threw
[u] > [ɑ]: shoot/shot
[u] > [o]: choose/chose
[ʌj] > [ɪ]: bite/bit, light/lit
[ʌ] > [e]: become/became, come/came, overcome/overcame
[ʌ] > [æ]: run/ran
[ʌj] > [ɔ]: fight/fought
[aj]/[ʌj] > [o]: smite/smote, stride/strode, write/wrote


Past tense vowel change + [d] or [t] patterns
[i] > [ɛd]: flee/fled
[iC] > [ɛCt]: creep/crept, deal/dealt, dream/dreamt, feel/felt, keep/kept,
kneel/knelt, leap/leapt, leave/left, mean/meant, sleep/slept,
sweep/swept, weep/wept
[e] > [ɛd]: say/said
[ɛl] > [old]: sell/sold, tell/told
[u] > [ɪd]: do/did, overdo/overdid
[uC] > [ɔCt]: lose/lost


Weak suppletive patterns with past tense [ɔt]
[iC] > [ɔt]: seek/sought, teach/taught
[ɪŋ(C)] > [ɔt]: bring/brought, think/thought
[æC] > [ɔt]: catch/caught
[aj] > [ɔt]: buy/bought


Other suppletive patterns 

be/was,were 

forego/forewent, go/went 

stand/stood, understand/understood, withstand/withstood 


Zero expression 
beat/beat, beset/beset, bet/bet, bid/bid, broadcast/broadcast, burst/burst, cast/cast,
cost/cost, cut/cut, fit/fit, forbid/forbid, hit/hit, hurt/hurt, knit/knit, let/let, put/put,
quit/quit, rid/rid, set/set, shed/shed, shut/shut, slit/slit, spit/spit, split/split,
spread/spread, thrust/thrust, upset/upset, wed/wed


4. Based on the given data, there are two factors that may have 
motivated the change. The first issue is grammatical gender. Both class 
(ii) and class (iii) contain masculine nouns, whereas class (i) contains 
feminine nouns. If the inflectional pattern for class (i) was considered 
by speakers to be the canonical feminine pattern, and class (iii) was 
considered to be the canonical masculine pattern, this could have 
motivated inflectional change in class (ii) towards the more canonical 
masculine pattern. 

The second potential factor is overlapping inflectional patterns. Class 
(i/ii) and class (iii) have similar patterns in the accusative (final [n]) and 
the dative (final [i]). The similarity of class (ii) to the class (iii) pattern in 
these paradigm cells could have served as the basis for shift in the other 
cells (NOM and GEN) to be more similar to class (iii). 


5. The following is an inheritance hierarchy for Russian nominal 
inflection classes. Inflectional information that is shared by most or all 
of the classes is introduced at the highest node in the hierarchy (e.g. 
dative, instrumental and locative plural forms), and more specific 
information is introduced at lower nodes (e.g. the genitive plural form). 
In a few instances information at lower nodes overrides the inherited 
information (e.g. nominative plural form in (iv)). 
NOM XZ Xy
ACC XZ Xy
GEN XV XZ
DAT XV Xam
INST XZ Xami
LOC Xe Xax


ACC = NOM


       (i)            (iv)           (ii)              (iii)
       SG    PL       SG    PL       SG    PL          SG     PL
NOM    X     Xy       Xo    !Xa      Xa    Xy          X'     X'i
ACC    X     Xy       Xo    !Xa      Xu    Xy          X'     X'i
GEN    Xa    Xov      Xa    X        Xy    X           X'i    X'ej
DAT    Xu    Xam      Xu    Xam      Xe    Xam         X'i    X'am
INST   Xom   Xami     Xom   Xami     Xoj   Xami        X'ju   X'ami
LOC    Xe    Xax      Xe    Xax      Xe    Xax         !X'i   X'ax
       ACC = NOM      ACC = NOM      !ACCSG = NOMSG    ACC = NOM


Intuitively, an important fact about this data is that the accusative 
is almost always the same as the nominative; the singular forms in (ii) 
(komnata, komnatu) are the exception. This generalization is captured 
at the highest node with the notation that ACC = NOM, and it then 
is inherited by lower nodes, and overridden in the class represented 
by komnata. (Note that all of the nouns in this exercise are inanimate. 
Animate nouns belonging to class (i), however, exhibit the pattern in 
ACC = GEN in the singular, as do all animates in the plural.) 
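
The idea of default inheritance with overrides can be sketched in Python. This is a simplified illustration, not the hierarchy from the exercise: it assumes a flat two-level hierarchy (one top node, classes directly beneath it), stores only a few endings taken from the tables above, and uses hypothetical names throughout.

```python
from collections import ChainMap

# Top node: plural endings shared by most classes (assumed subset of the table).
TOP = {
    ("NOM", "PL"): "y",
    ("DAT", "PL"): "am",
    ("INST", "PL"): "ami",
    ("LOC", "PL"): "ax",
}

# Lower nodes add class-specific endings and may override inherited ones.
CLASS_I = {("NOM", "SG"): "", ("GEN", "SG"): "a", ("GEN", "PL"): "ov"}
CLASS_II = {("NOM", "SG"): "a", ("ACC", "SG"): "u", ("GEN", "PL"): ""}
CLASS_IV = {("NOM", "SG"): "o", ("NOM", "PL"): "a", ("GEN", "PL"): ""}

def ending(inflection_class, case, number):
    """Look up an ending: class-specific entries win over the top node."""
    node = ChainMap(inflection_class, TOP)
    if (case, number) in node:
        return node[(case, number)]
    if case == "ACC":
        # Default stated at the top node: ACC = NOM unless a class overrides it.
        return ending(inflection_class, "NOM", number)
    raise KeyError((case, number))

print(ending(CLASS_IV, "NOM", "PL"))  # class (iv) overrides the inherited ending
print(ending(CLASS_II, "ACC", "SG"))  # class (ii) overrides ACC = NOM in the singular
print(ending(CLASS_II, "ACC", "PL"))  # falls back to ACC = NOM, then to the top node
```

Placing ACC = NOM in the lookup function rather than in each class mirrors the point made above: the generalization is stated once and only the exceptional class has to mention the accusative at all.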


The quantitative criterion does not provide clear evidence on the issue. 
In other inflection classes the second person singular is syncretic with 
the plural, but so is the first person singular (unlike in the verb be). 
The two inflection classes thus do not exhibit identical patterns of 
syncretism. Moreover, the relevance of this criterion is particularly 
unclear in the past tense — only be has more than one form. 
I         am    walk    was    walked
you(SG)   are   walk    were   walked
he/she    is    walks   was    walked
we        are   walk    were   walked
you(PL)   are   walk    were   walked
they      are   walk    were   walked


According to the syntactic criterion, 2SG are/were is systematically
syncretic with plural are/were. When the verb has to agree simultaneously
with the second person singular and a plural subject, the same verb can 
agree with both: Either we or you are/were supposed to play in the Bulgaria 
match. 

The diachronic evidence also suggests systematic syncretism. In the 
history of English, the 2sg had a form distinct from the plural form: art 
in the present tense, wast in the past tense (wast had earlier replaced 
etymological were under influence of other verbs that had the form Xt 
in the 2sg past tense). Over time, the pronoun you displaced thou, and 
around the same time, are/were displaced art/wast. The ultimate result 
was that the 2SG was reformed on the basis of the 2PL.


           PRS   PST
I          am    was
thou(SG)   art   wast   >   you   are   were
he/she     is    was
we         are   were
you(PL)    are   were
they       are   were

Chapter 9 
1. Referentiality and morphological cohesion are not relevant to these
examples. Moreover, English does not generally use special segmental
markers (e.g. interfixes) to indicate compounds, so this criterion is 
uninformative. However, four tests can be applied: phonological 
cohesion, anaphoric replacement, expandability and coordination 
ellipsis. 


backboard: compound 


a. Phonological cohesion: one main word stress on the first syllable
(báckboard)

b. Anaphoric replacement impossible: *Every basketball court has one 
scoreboard and two back ones. (i.e. two backboards) 

c. Not expandable: *The ball hit the very backboard. (i.e. the board in the
very back) 

d. Coordination ellipsis impossible: *The back and scoreboards were 
damaged by vandals. 
backdoor: phrase

a. Lacks phonological cohesion: two main word stresses (báckdóor) 

b. Anaphoric replacement possible: My house has one front door and two 
back ones. (i.e. two backdoors) 

c. Expandable: When you get to the house, go through the far backdoor (i.e. 
the door in the far back of the house). 

d. Coordination ellipsis possible: The back and side doors were mistakenly 
left open. 


back seat: phrase 

a. Lacks phonological cohesion: two main word stresses (báck séat)

b. Anaphoric replacement possible: My car has one front seat and two
back ones. (i.e. two back seats)

c. Expandable: There is a lever in the trunk to release the far back seat (i.e. 
the seat in the far back of the car). 

d. Coordination ellipsis possible: The back and front seats we had 
upholstered in red velvet, but the middle row we upholstered in purple 
leather. 


2. Third singular present tense -s has no freedom of stem selection; it 
must attach to verbs. It also has no freedom of movement. For instance, 
it cannot be clefted separately from its verb. (*It is -s that Mary walk.) 
Finally, it undergoes morphophonological alternation: [-s], [-əz] or [-z], 
depending on the final consonant of the verb stem. 
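
The alternation can be sketched as a small Python function. The answer above only says that the choice depends on the final sound of the stem; the conditioning used here (sibilants take [-əz], other voiceless consonants take [-s], everything else takes [-z]) is the standard textbook description of English and is an assumption of this sketch, as are the function name and the simplified phoneme sets.

```python
# Simplified phoneme inventories (assumed for this illustration).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS_NON_SIBILANTS = {"p", "t", "k", "f", "θ"}

def third_singular_allomorph(final_sound: str) -> str:
    """Return the allomorph of 3SG present -s for a stem ending in final_sound."""
    if final_sound in SIBILANTS:
        return "əz"
    if final_sound in VOICELESS_NON_SIBILANTS:
        return "s"
    return "z"  # voiced non-sibilant consonants and vowels

print(third_singular_allomorph("k"))   # walk
print(third_singular_allomorph("ʃ"))   # wash
print(third_singular_allomorph("n"))   # run
```

The input is the final phoneme of the stem rather than its spelling, since the alternation is conditioned by sound, not by orthography.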


3. Polish: clitics attach after their hosts (they are enclitic), so the clitic go 
cannot occur in sentence-initial position. 

French: the pronoun occurs in a position that requires emphatic 
(technically, contrastive) stress, so use of the clitic tu is not grammatical. 
Free form toi should be used. 

Serbian: the clitic je is a second-position clitic, but in this sentence it 
occurs in fourth position. 

Ponapean: the position of the completive suffix -la indicates that keng- 
wini- is a compound verb with an incorporated noun (wini- ‘medicine’ 
is the dependent member). However, dependent members generally 
cannot be referential (including in Ponapean), so the demonstrative -o 
(which is associated with wini-) makes the word ungrammatical. 


4. As discussed in this chapter, the pronominals are outside of the domain
of word stress. This is evidence that they are clitics. However, clitics do 
not generally undergo morphophonological alternations; in this respect 
the pronominals behave like affixes. 


5. In Lithuanian, s(i) is like a clitic in that it appears either after the 
verb root (when not negated), or before it (when negated). It thus 
has freedom of host selection. It is like an affix in that it undergoes a 
morphophonological alternation. 
Two criteria from Table 9.1 seem to be relevant. First, the first person
active marker a- is prefixed to the verb root jogua ‘buy’ in (11.26a), 
but immediately precedes the noun mba’e ‘thing’ in (11.26b). If we 
assume that a- attaches only to verbs, and not to nouns, then the 
entire incorporated structure apparently serves as the domain of 
affixation. This is parallel to example (9.4). Second, as (11.27) shows, the 
dependent noun mba'e is not expandable — it cannot be modified by the 
adjective hepy ‘expensive’. Both of these criteria suggest that (11.26b) is 
a compound. 


Chapter 10 


1. 


Morphophonological alternation. The most important issue is that stem 
voicing is both morphologically and lexically conditioned. It occurs 
in the plural, but not under the same phonological conditions in the 
possessive (e.g. [fs] in leaf's), and does not occur in all relevant words 
(e.g., briefs, roofs, cuffs). 

The affected sounds form a natural class (voiceless obstruents), and 
produce a natural class (voiced obstruents). The 'input' and 'output' are 
also phonetically close. While phonetic coherence and phonetic closeness
are typical traits of automatic alternations, morphophonological 
alternations may or may not be phonetically coherent. These properties 
thus provide no solid evidence either way. 

Finally, the alternation does not apply to loanwords, its application 
is not sensitive to speech style, and it does not create a new segment 
(voiced obstruents exist in English independently of the alternation). It 
does not apply across word boundaries (a lea[f] sat on the porch; *a lea[v] 
sat on the porch). 


Relic alternation. The alternation applies to very few words, all of which 
are comparative adjectives: long/longer, strong/stronger, young/younger. 
(The rare example wrong/wronger seems to vary; some speakers have 
the alternation, and some have only [ŋ] in both word-forms.) Nouns
do not exhibit the alternation (e.g. sing/singer). And it is not productive 
even in comparatives (e.g. winningest, blinger than bling). 


a. Morphophonological alternation: the palatalization rule does not 
apply to loanwords (laasiisii ‘licence’) or to new instances of the 
conditioning environment (féeba ‘cooked cassava flour’), so it must 
be at least in part lexically conditioned. 

b. Automatic alternation: the rule applies across word boundaries 
and in a phonetically motivated environment. 

c. Automatic alternation: the palatalization rule applies in a 
phonetically motivated environment, and whenever that 
environment arises. This rule is similar to the Hausa example in (a), 
but the crucial difference is that in Modern Greek, the palatalization
rule applies to loanwords (Greek [cinino] ‘chinine’). 

d. Morphophonological alternation: the fact that the alternation 
occurs only in ‘certain morphological forms’ indicates that the rule 
is morphologically conditioned. It is also lexically conditioned, as 
evidenced by the fact that it fails to apply to some (native) words. 


4. Phonetic distance: the sounds that are affected sometimes differ from 
their replacements by more than one phonological feature (e.g. ph and 
sh differ in both place (labial vs. alveolar) and manner (stop vs. fricative) 
of articulation). 

No new segments: based on the given data, at least some of the 
output sounds are independently attested: j in gijimisa ‘make run’; sh in 
shaya 'beat'. It is unclear from this data whether the remaining output 
sounds (esp. ny, nj) are new segments, or also independently exist in 
the language. 

Phonological motivation: in (10.11), the affected sounds are not 
immediately adjacent to the conditioning environment (-w). Given that 
automatic alternations are phonologically motivated, and phonological 
rules generally apply to adjacent sounds, we can take this as further 
evidence that the alternation in (10.11) is morphophonological in
nature.
Chapter 11 
1. [/support/V]          <>   [/supportive/A]
   SBJ   —   OBJ         <>   OBLof
   |         |                |
   agent     patient          patient

2. [/X/V]                <>   [/outX/V]
   SBJ                   <>   SBJ   —   OBJ
   |                          |         |
   agent_i                    agent_i   patient_j
   ‘A_i acts’                 ‘A_i acts better than B_j’


3. Section 11.3.3: ‘In contrast to (complex) event nouns, agent nouns in 
English ... do not seem to inherit the verb's argument structure.' In the 
example ruler over a large empire, ruler is an agent noun derived from 
the verb rule, but it seems to have inherited the argument structure of 
the verb (The king rules over a large empire). Ruler over a large empire has a 
possessive phrase, similar to explorer of Antarctica. 


Chapter 12 


1. a. The plural is usually less frequent than the singular, so we expect
that in Udmurt, plural inflected forms will be longer than singular
inflected forms. This expectation is fulfilled. Since the third person
is usually more frequent than other values of the feature PERSON,
we likewise expect that the third person will be shorter. This
expectation is not fulfilled; the first person singular form is shorter
than the third person singular form.

b. The plural forms are longer than the singular forms in Even; this
fits expectations for the same reasons as in (a). The nominative is
also shorter than the other case forms; this is expected given that
the nominative is usually the more frequently used form.

c. In the Pipil example, the plural forms are longer than the
corresponding singular form in two instances; they are the same
length in one (1SG nuchi:l ‘my chilli pepper’; 1PL tuchi:l ‘our chilli
pepper’). This mostly matches expectations. Also, the third person
singular form is the shortest; this also matches expectations.

d. The Tauya data entirely matches expectations. The plural forms are
longer than the singular forms, and the third person singular form
exhibits zero expression and is therefore the shortest of all forms.


2. In this chapter it was noted that within a paradigm, the forms with
higher relative frequency usually serve as the model for analogical 
levelling, and forms with lower relative frequency usually undergo 
analogical levelling. Since singular forms are usually more frequent 
than plural forms, we would expect the singular to have served as the 
model for analogical levelling in the French example. The change is 
therefore surprising because the singular forms have undergone stem 
levelling based on the (presumably less common) plural forms. 


Frequency was discussed in Section 10.2, where it was noted that in 
German, the lexeme BE, which has high absolute frequency, preserves 
an alternation between r and s (war/ge-wes-en ‘was/been’) that has been
levelled in other words (las/ge-les-en ‘read(PST)/read(PART)’). This
fact fits with the discussion in this chapter in that lexemes with high 
absolute frequency resist analogical change, whereas lexemes with 
low absolute frequency are likely to undergo it. 
Glossary of
technical terms 


abessive: an inflectional value of the feature CASE: ‘without, lacking’ (e.g.
Udmurt val-tek [horse-ABE.SG] ‘without the horse’) (Section 12.1.2).


ablative: an inflectional value of the feature CASE: ‘(away) from’ (e.g. Huallaga
Quechua mayu-pita [river-ABL.SG] ‘(away) from the river’) (Section 5.1).


absolutive: an inflectional value of the feature CASE that is used to mark
both the subject of an intransitive verb and the object of a transitive verb 
(Section 11.1.5). 


acceptability judgement: a native speaker’s assessment of whether a word 
or sentence is a possible word/sentence of the language (Section 6.1). 


accusative: an inflectional value of the feature CASE that is used to mark the 
direct object (e.g. Latin Marcus rosa-m [rose-ACC] vidit ‘Marcus saw a rose’)
(Section 5.1). 


acronym: an abbreviation consisting of initial letters that are read like an 
ordinary word, e.g. NATO [neitou] (as opposed to alphabetism) (Section 3.1.5). 


action noun = event noun. 


active: an inflectional value of the feature VOICE (‘semantic agent is the
syntactic subject’) (Section 5.1). 


actual word (= usual word): a lexeme that is familiar to most speakers 
(Section 4.2) (cf. neologism, possible word, occasionalism). 


adjunct: a participant in an event that is optionally expressed (Section 
11.1.5) (cf. argument). 


adposition: a term that refers collectively to prepositions and postpositions 
(Chapter 7 Exploratory Exercise). 


affirmative: an inflectional value of the feature MOOD that indicates the
veracity of the argument. 
affix: a morpheme that must attach to a base and cannot occur by itself.
Usually a short morpheme with an abstract meaning (Section 2.2) (subtypes: 
circumfix, infix, prefix, suffix). 

affixation: the operation of stringing together a base and an affix (Section 
3.1.1). 


affix compound: a morphological pattern that involves at least two stems 
and one affix (Section 11.2.3). 


agent: a semantic role; the instigator of an action (Chapter 11). 


agentive adjective: a deverbal adjective denoting an action performed by 
the modified noun, e.g. English typing, as in a typing monkey sat on the floor 
(Section 5.2). 


agent noun: a deverbal noun that refers to the agent participant of the 
action, e.g. English drinker (Sections 5.2, 11.3.3). 


agglutination = concatenation. 


agglutinative language: a language in which almost all words are formed 
by concatenation of morphemes (cf. fusional language, isolating language) 
(Section 4.2). 


agreement: a syntactic relation that requires related constituents to show 
identical marking for certain inflectional values (e.g. verbs often agree for 
number with subject or object nouns) (Section 5.3). 


allative: an inflectional value of the feature CASE: ‘motion towards, onto’. 


allomorph (= morpheme alternant): two roots or morphological patterns 
are allomorphs (of the same abstract morpheme) if they express the same 
meaning and occur in complementary distribution (Section 2.3) (subtypes: 
phonological allomorph, suppletive allomorph). 


alphabetism: an abbreviation consisting of initial letters that are read with 
the letters’ alphabet values, e.g. CD [si:di:] (Section 3.1.5). 


alternation: the differences in pronunciation between two (or more) 
phonological allomorphs (Section 2.3, Chapter 10) (subtypes: automatic 
alternation, morphophonological alternation). 


analogical extension: a kind of analogical change in which an existing 
morphological pattern is applied to a different or new lexeme (Section 6.4.3). 


analogical levelling (= levelling): a kind of analogical change in which a 
word-form is changed based on another word-form belonging to the same 
lexeme (Sections 10.2, 12.2). 


analogy (= analogical change): the use of similar existing words as models 
in the modification and creation of words (Section 6.4.3). 
analytic language: a language that uses little morphology (Section 1.2) (cf.
polysynthetic language, synthetic language). 


animacy: a semantic property of nouns that has to do with whether a noun 
denotes a living (or sentient) thing (Section 8.2.1). 


anticausative: an event-changing operation signalling that there is no 
'cause' element and no agent role in the derived event structure (e.g. 
Russian anticausative Tarelka slomalas’. ‘The plate broke.’ vs. Ona slomala 
tarelku ‘She broke the plate’) (Section 11.1.2). 


antipassive: a function-changing operation that backgrounds the patient 
(Section 11.1.3). 


aorist: an inflectional value for verbs of the feature TENSE that indicates the 
occurrence of an action in the past, without indicating whether the action 
is completed. 


applicative: a valence-changing operation that creates a new object 
argument (e.g. German beladen ‘load onto’ requires a direct object, whereas 
laden ‘load’ does not) (Section 11.1.5) (subtypes: recipient applicative, locative 
applicative, benefactive applicative). 


appositional compound: a compound denoting an entity that fulfils several 
descriptions simultaneously, e.g. English student worker ‘worker who is also 
a student’ (Section 7.1). 


argument: a semantic role that is assigned to a noun by the verb (Section 
11.1.1) (cf. adjunct). 


argument inheritance: a deverbal derived word is said to exhibit argument 
inheritance when its argument structure (and function structure) match 
and are dependent upon the argument/function structure of the verbal 
base from which it is derived (Section 11.3.1). 


argument structure (= semantic valence): a verb’s argument structure is 
the set of semantic roles that it assigns (Chapter 11). 


aspect: an inflectional feature of verbs that has to do with the internal 
temporal constituency of an event (values: perfective, imperfective, habitual, 
etc.) (Section 5.1). 


attenuative adjective: a deadjectival adjective that denotes a reduced 
degree of the base (e.g. English bluish from blue) (Section 5.2) (cf. intensive 
adjective). 


augmentative noun: a denominal noun denoting a larger (or otherwise 
pragmatically special) version of the base noun, e.g. Russian borodisca ‘huge 
beard’ is an augmentative of boroda ‘beard’ (Section 5.2). 


automatic alternation: a sound alternation that is purely phonologically 
conditioned (Section 10.1). 
auxiliary: a verb that co-occurs with a main verb in a phrase to indicate
values of verbal features such as TENSE or MOOD. 


back-formation: the formation of a shorter, simpler word from a longer 
word that is perceived as morphologically complex (Section 3.2.2). 


bahuvrihi compound = exocentric compound.


base: the base of a morphologically complex word is the element to which 
a morphological operation applies (Sections 2.2, 3.1.2). 


base modification: a collective term for morphological operations that 
change the pronunciation of part of the base, usually without adding 
segmentable material (Sections 3.1.2, 10.5). 


benefactive applicative: a valence-changing operation that creates a new 
direct object argument for the participant who is the beneficiary of the 
action (Section 11.1.5). 


blend: a lexeme whose stem was created by combining parts of two other 
lexeme stems, e.g. smog from smoke and fog (Section 3.1.5). 


blocking (= synonymy blocking): when the application of a productive 
rule is pre-empted by an existing word with the same meaning (Section 
6.4.2). 


bound form: an element (word-form or affix) that is prosodically dependent 
on its host and cannot stand on its own in a variety of ways (Section 9.3). 


bound stem: a base that is not by itself a word-form and which therefore 
occurs only in combination with another morpheme (Section 2.2). 


case: an inflectional feature of nouns that serves to code the noun phrase's 
semantic role (values: nominative, accusative, genitive, dative, locative, ablative, 
instrumental, etc.) (Section 5.1). 


categorial periphrasis: when a given inflectional value is always expressed 
by a periphrastic (multi-word) expression (e.g. the French future always 
involves a multi-word expression) (Section 8.8). 


category-conditioned degree of productivity: a measure of productivity; 
the ratio of the number of hapax legomena with a given morphological 
pattern to the total sum token frequency of all word-forms with that 
morphological pattern (Section 6.5). 
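
As a worked illustration of this ratio, the following Python sketch counts hapax legomena and total tokens for one morphological pattern. The function name and the tiny token list are invented for the example; a real estimate would need tokens drawn from a large corpus.

```python
from collections import Counter

def category_conditioned_productivity(tokens):
    """Ratio of hapax legomena to total token count for one morphological pattern."""
    counts = Counter(tokens)
    hapaxes = sum(1 for n in counts.values() if n == 1)
    total_tokens = sum(counts.values())
    return hapaxes / total_tokens if total_tokens else 0.0

# Invented sample: every token of a -ness word found in a (hypothetical) corpus.
ness_tokens = [
    "darkness", "darkness",
    "happiness", "happiness", "happiness",
    "sadness", "sadness",
    "aimlessness",  # occurs once: hapax legomenon
    "fondness",     # occurs once: hapax legomenon
]

print(category_conditioned_productivity(ness_tokens))  # 2 hapaxes / 9 tokens
```

Because the denominator is the summed token frequency rather than the number of word types, a pattern dominated by a few very frequent words comes out with a low value even if it has many types.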


causative: an event-changing operation referring to an event that is a caused 
version of the base event (Section 11.1.4). 


causative verb: a deverbal verb denoting an action that has caused the 
action represented by the base verb to occur, e.g. Korean cwuk-i- ‘kill’ is the 
causative of cwuk- ‘die’ (Section 5.2). 




cell: a position in a paradigm defined by the possible combinations of 
inflectional values (Chapter 2 Exploratory Exercise, Appendix to Chapter 
5). 


circumfix: a discontinuous affix that occurs on both sides of the base 
(Section 2.2). 


citation form: a word-form that is used by convention to refer to a lexeme 
- e.g. when listing a lexeme in a dictionary. 


classifier: a morpheme, usually a lexeme stem, used to classify a noun as 
belonging to a semantically-based group. 


clipping: (a method of forming) a shortened word that does not differ 
semantically from the longer version (Section 3.1.5). 


clitic: a bound word-form — i.e. a word-form that is prosodically dependent 
on a host (Sections 9.2-9.3). 


clitic group: an expression formed by one or more clitics and the host 
(Section 9.3). 


coalescence: a diachronic change whereby two formerly free syntactic 
elements turn into a single word-form (Section 3.2.2) (cf. grammaticalization). 


combinatory potential (= subcategorization frame): the information 
in a lexical entry about the surrounding elements with which a word or 
morpheme can or must combine (Section 3.1.1). 


combining form: a bound stem that occurs only in a compound. 


comparative: an inflectional value of the feature DEGREE (‘having a higher 
degree, more’) (Section 5.1). 


competence: the speaker's knowledge of the linguistic system (Section 6.1) 
(cf. performance). 


complementary distribution: when two morphs occur in non-overlapping 
environments, they are in complementary distribution; a partial criterion 
for identifying allomorphs of the same morpheme (Section 2.3). 


complex event noun: a deverbal noun that refers to the event or action itself, 
and which inherits the base verb's argument structure (Section 11.3.2). 


complex word: a word that is one of a group of words that show systematic 
covariation in their form and meaning - i.e. morphological structure 
(Section 1.1). 


complexity-based ordering: the idea that restrictions on affix order are 
emergent from the structure of the lexicon: affixes that are more likely to be 
stored in the lexicon together with their stems must occur closer to the root 
than affixes that are likely to be decomposed during lexical access (Section 
10.4) (cf. level ordering). 


compositional meaning: when the meaning of a complex word is equal 
to the sum of the meanings of its component morphemes (Sections 4.1, 
5.3). 


compound (= compound lexeme): a complex lexeme that is made up of more 
than one lexeme stem (Sections 2.1, 7.1) (subtypes include affix compound, 
appositional compound, coordinative compound, endocentric compound, exocentric 
compound, phrasal compound, synthetic compound). 


compounding: the formation of compounds (Sections 2.1, 3.1.1). 


concatenative operation: an operation that consists of stringing morphemes 
together — i.e. affixation or compounding (as opposed to non-concatenative 
operations such as base modification or reduplication) (Section 3.1.1). 


conceptual structure (= event structure): the formal semantic decomposition 
of a verb's meaning (Section 11.1.1). 


concrete noun: a deverbal noun that is similar to an event noun, but which 
does not refer to the event or action itself; instead, it refers to the product of 
the action (e.g. building), a group of people associated with the action (e.g. 
management), etc. (Section 11.3.2). 


conditional: an inflectional value of the feature MOOD that indicates a 
hypothetical, unrealized action. 


conditioning: the environments in which different allomorphs of the 
same morpheme occur (Section 2.3) (subtypes: phonological conditioning, 
morphological conditioning, lexical conditioning). 


conjugation: (i) an inflection class of a verb; (ii) verb inflection in general 
(Section 8.2). 


constituent: a subgrouping within the structure of a word or sentence 
(Sections 1.1, 3.2.1). 


contextual inflection: a part of inflectional morphology consisting of 
features that are assigned to a word because of the syntactic context in 
which it appears, i.e. as a result of agreement or government; an inflectional 
feature may be inherent for one word-class and contextual for another 
(Section 5.4.1) (cf. inherent inflection). 


continuative: an inflectional value of the feature ASPECT that indicates an 
ongoing action. 


continuum approach: a hypothesis according to which morphological 
patterns are understood as lying on a continuum ranging from canonical 
inflection to canonical derivation, without sharp boundaries between types 
(Section 5.3) (cf. dichotomy approach). 


controller (of agreement): in syntax, the constituent whose properties 
determine the properties of the agreeing constituent, e.g. when a 
noun determines the gender property of an adjective that agrees with it, the 
noun is the controller (Section 5.3) (cf. target). 


converb: a verb-form that is used for adverbial subordination (Section 5.1). 


conversion: a morphological pattern in which the pronunciation of the base 
does not change (Section 3.1.4). 


coordination ellipsis: a test for word status; one of two identical elements 
in a coordinated phrase can usually be deleted, but a compound member 
cannot be deleted in this way (Section 9.1). 


coordinative compound (= dvandva compound): a compound that refers 
to multiple referents corresponding to the compound members: e.g. Korean 
elun-ai ‘adult and child’ (elun ‘adult’, ai ‘child’) (Section 7.1). 


coreferential: two nouns are coreferential if they refer to the same entity 
(Section 11.1.2). 


count noun: a noun that can refer to individual entities, and can have both 
singular and plural forms (e.g. English table) (Section 8.2) (cf. mass noun). 


creativity: the creation of neologisms by unproductive patterns (Section 
6.2). 


cross-formation: the formation of a complex word from a base that is itself 
complex, by removing part of the base (Section 3.2.2). 


cumulative expression (= fusion): the expression of multiple morphological 
meanings simultaneously by a single un-analyzable element (Section 4.1). 


dative: an inflectional value of the feature CASE that indicates the indirect 
object of a verb (Section 5.1). 


deadjectival: a formation whose base is an adjective is called deadjectival 
(Section 5.2). 


declarative: an inflectional value of the feature MOOD that indicates the 
proposition expressed is an unqualified statement of fact. 


declension: (i) an inflection class of a noun; (ii) noun inflection in general 
(Section 8.2). 


decomposition route: a means of lexical access in which a complex word 
is broken up into component morphemes, and information about the word 
(e.g. meaning) is retrieved from the individual morphemes’ lexical entries 
(Section 4.3) (cf. direct route). 


default rule: a default rule is one that applies in the general case, when no 
more specific rule applies (Sections 8.4, 8.6.2). 


defective: a lexeme is defective if some cells of its inflectional paradigm are 
not filled — i.e. if there are some inflectional meanings that it cannot express 
(Section 8.7.1). 


definite: an inflectional value of the feature DEFINITENESS. 


degree: an inflectional feature of adjectives having to do with comparison 
of gradable properties (values: comparative, superlative). 


degree of exhaustion: a measure of productivity; the ratio of words 
exhibiting a given morphological pattern to all words within the domain of 
that pattern (Section 6.5). 


degree of generalization (= profitability): a measure of productivity; the 
type frequency of the relevant morphological pattern (Section 6.5). 


denominal: a formation whose base is a noun is called denominal (Section 
5.2). 


deobjective: a valence-changing operation in which the patient is removed 
from argument structure, or not linked to any element in function structure 
(Section 11.1.3). 


dependent: an element in a compound or syntactic phrase that modifies the 
head (Sections 7.1-7.3). 


dependent verb: a verb that is confined to a subordinate clause - i.e. to a 
clause that cannot stand alone as a sentence. Dependent verbs are nonfinite 
(Section 5.1). 


deponency: a type of mismatch between form and meaning. Imagine that 
a language has inflectional feature A with value {a}, usually expressed by 
form a, and value {b}, usually expressed by form b. A lexeme is deponent if 
it has a paradigm with the form a, but this form expresses the meaning {b}. 
In other words, a deponent verb has inflectional values that are ‘opposite’ 
to what is expected, given its formal marking (Section 8.7.2). 


derivation1 (= derivational morphology): the relationship between 
lexemes of a word family; a part of morphology that is characterized by 
relatively concrete morphological meanings, potential semantic irregularity, 
restrictions on applicability, etc. (Section 2.1 and Chapter 5) (Note: derivation1 
is closely related neither to derive1 nor to derive2!). 


derivation2: the process of deriving1 or deriving2. 


derivational morphology = derivation1. 


derivative: a lexeme that is related to another lexeme by virtue of having 
been derived from it (Section 2.1). 


derive1 (A from B): build or form a complex word A on the basis of a base 
B (Chapter 3). 


derive2 (A from B): construct a (phonological) surface representation A 
by applying a series of modifying rules to an underlying representation B 
(Section 10.4). 


desiderative: a deverbal derivational meaning: ‘want to do’, e.g. Inuktitut 
sinikkuma- ‘want to sleep’ is the desiderative of sini- ‘sleep’ (Section 5.2). 


deverbal: a deverbal lexeme is one whose base is a verb (Section 5.2). 


diachronic: having to do with language change over time (Section 6.1) (cf. 
synchronic). 


diachronic productivity: a measure of productivity; the number of 
neologisms with a given morphological pattern attested over a period of 
time (Section 6.5). 


dichotomy approach: a hypothesis according to which inflectional and 
derivational patterns have sufficiently different traits as to suggest that 
they represent distinct subsystems in morphological architecture (Section 
5.3) (cf. continuum approach). 


diminutive noun: a denominal noun denoting a smaller (or otherwise 
pragmatically special) version of the base noun, e.g. Spanish gatito ‘little 
cat’ is a diminutive of gato ‘cat’ (diminutive adjectives, adverbs and verbs 
are also possible) (Section 5.2). 


direct route: a means of lexical access in which information about a 
complex word (e.g. meaning) is retrieved directly from the lexical entry for 
the complex word (Section 4.3) (cf. decomposition route). 


domain: the domain of a rule is the set of bases to which a rule could apply 
in principle (Sections 6.1, 6.3). 


dual: an inflectional value of the feature NUMBER (‘two’) (Section 5.1). 


dual-processing model: a psycholinguistic model of inflection that 
assumes two completely separate modes of mental processing and storage 
of inflected forms (Section 4.3). 


duplifix: an element attached to the base that consists of both copied 
segments and fixed segments (= a mixture of affix and reduplicant) (Section 
3.1.3). 


durative: an inflectional value of the feature AsPECT that indicates an 
ongoing action or state. 


dvandva compound = coordinative compound. 


economy: a common way of measuring the elegance of a description, 
particularly in the context of the lexicon (smaller lexicon = more economical 
description = more elegance) (Section 4.1). 


elative: an inflectional value of the feature CASE (‘motion away from’) 
(Chapter 4 Comprehension Exercises). 


elsewhere condition: the principle that more specific conditions apply 
before more general conditions. For instance, if two word-forms are 
compatible with syntactic requirements, the word-form that expresses the 
more specific set of features will be inserted into syntactic structure (Section 
8.6.2). 
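
The selection described in this entry can be sketched as follows. This is a minimal illustration, not an analysis from the text: the function names, the use of the number of specified values as a stand-in for specificity, and the toy treatment of English present-tense be (with are as an unspecified default form) are all assumptions.

```python
def is_compatible(form_values, required):
    """True if the form specifies nothing that conflicts with the
    values required by the syntax (the form may specify less)."""
    return all(required.get(feature) == value
               for feature, value in form_values.items())


def select_form(candidates, required):
    """Among the compatible word-forms, return the most specific one."""
    compatible = [(form, values) for form, values in candidates
                  if is_compatible(values, required)]
    if not compatible:
        return None
    # Assumption: specificity is approximated by how many values a form specifies.
    best_form, _ = max(compatible, key=lambda pair: len(pair[1]))
    return best_form


# Toy lexicon: "are" specifies no person or number value and so acts as the default.
BE_PRESENT = [
    ("am", {"PERSON": 1, "NUMBER": "SG"}),
    ("is", {"PERSON": 3, "NUMBER": "SG"}),
    ("are", {}),
]

first_singular = select_form(BE_PRESENT, {"PERSON": 1, "NUMBER": "SG"})
second_plural = select_form(BE_PRESENT, {"PERSON": 2, "NUMBER": "PL"})
```

Both am and are are compatible with a first person singular context, but am expresses the more specific set of values and is therefore chosen; are is inserted only where no more specific form fits.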


empty morph: a morph (generally an affix) that has no meaning but that 
must be posited for the sake of descriptive elegance (Section 4.1). 


enclitic: a clitic that follows its host. 


endocentric compound: a compound that consists of a head and a 
dependent (or several dependents); the meaning of the entire compound is a 
hyponym of the meaning of the semantic head (Section 7.1). 


ergative: an inflectional value of the feature CASE that indicates the agent of 
a transitive verb. 


ergative-absolutive language: a language that uses the same grammatical 
markers to indicate the argument of an intransitive verb and the object of a 
transitive verb, and a different grammatical marker to indicate the agent of 
a transitive verb (cf. nominative-accusative language). 


essive: an inflectional value of the feature CASE that indicates a state of 
being (Chapter 4 Comprehension Exercises). 


event-changing operation: A morphological operation that changes the 
argument structure of a verb (Section 11.1) (cf. function-changing operation). 


event noun (= action noun): a deverbal noun that refers to the event or 
action itself, e.g. English replacement ‘act of replacing’ (Section 11.3.2) 
(subtypes: simple event noun, complex event noun). 


event structure = conceptual structure. 


exclusive first person: refers to a group including the speaker but not 
including the addressee (Chapter 8 Exploratory Exercise) (cf. inclusive first 
person). 


exocentric compound (= bahuvrihi compound): a compound pattern that 
does not contain a (semantic) head and a dependent (Section 7.1). 


expandability: a test for compound status; dependents in phrases can be 
expanded by modifiers (adjectives and adverbs), but dependent members 
in compounds generally cannot (Section 9.1). 


experiencer: a semantic role: the participant that experiences an experiential 
situation (Section 11.1). 


exponent: when a morphological pattern (e.g. -ed) expresses an inflectional 
feature value (e.g. past tense), it is the exponent of that feature value (Chapter 
12). 


facilitative adjective: a deverbal derivational meaning: ‘able to undergo an 
action’ (e.g. Basque jangarri ‘edible’ is the facilitative adjective corresponding 
to jan ‘eat’) (Section 5.2). 


factitive verb: a deadjectival verb with the derivational meaning ‘cause 
something to be Adj’ (e.g. Russian cernit ‘to blacken’ is the factitive of 
cernyj ‘black’) (Sections 5.2-5.3). 


feature = inflectional feature. 


feature-value compatibility: the principle that the morphology must 
provide to the syntax a word-form whose morphosyntactic values do not 
conflict with those required by the syntax (Section 8.6.2) (cf. feature-value 
identity). 


feature-value identity: the principle that the morphology must provide to 
the syntax a word-form whose morphosyntactic values exactly match those 
required by the syntax (Section 8.6.2) (cf. feature-value compatibility). 


feature-value notation: a convention for representing the inflectional 
values associated with a word-form, e.g. [CASE: GENITIVE]. 


female noun: a derivational meaning of nouns (‘female’) — e.g. English 
poetess (derived from poet) (Section 5.2). 


feminine: an inflectional value of the feature GENDER. 


first (person): an inflectional value of the feature PERSON (‘refers to the 
speaker’) (Section 5.1). 


focus: indicates prominent or new information in a discourse. 


formal head: in (endocentric) compounds, the lexeme that serves as the 
morphosyntactic locus and determines word-class, gender and inflection 
class (Sections 7.2-7.4). 


formalist orientation = generative orientation. 


free form: a word-form that is not bound (Section 9.2). 


freedom of host selection: a property of clitics such that they can take 
various syntactic categories as a host (Section 9.3). 


freedom of movement: a test for word status; constituents of a phrase may 
be clefted, topicalized, etc., but constituents of a word may not (Section 9.2). 


fronting: a type of base modification that involves changing the place of 
articulation of a sound (usually a vowel) so that it is pronounced closer to 
the front of the mouth (Section 3.1.2). 


function-changing operation: A morphological operation that changes the 
way semantic roles are linked to syntactic functions; function-changing 
operations are often encoded by the inflectional feature of voice (Section 
11.1) (cf. event-changing operation). 


function structure (= syntactic valence): the set of syntactic functions of a 
verb's arguments (Section 11.1.1). 


functionalist orientation: an approach to (morphological) research 
that emphasizes system-external explanation (Section 1.3) (cf. generative 
orientation). 


fusion = cumulative expression. 


fusional language: A language that makes a fair amount of use of 
morphology, but in which there is not usually a one-to-one correspondence 
between morphs and meanings (cf. agglutinative language, isolating language). 


future: an inflectional value of the feature TENSE (‘occurring later than the 
moment of speech’) (Section 5.1). 


gemination: a type of base modification that involves making a sound 
(usually a consonant) longer (Section 3.1.2). 


gender: an inherent property of nouns in some languages that is reflected 
in agreement (by adjectives, verbs and other agreement targets) and that 
serves to group the nouns into classes. Gender is an inflectional feature for 
agreement targets; typical values are masculine, feminine, but sometimes 
simply gender 1, gender 2, etc. 


generality: a common way of measuring the elegance of a description, 
according to which fewer descriptions capturing a larger portion of the 
data each is deemed more general (and thus more elegant) than more 
descriptions capturing a smaller portion of the data each (Section 1.3). 


generative orientation (= formalist orientation): an approach to 
(morphological) research that seeks to discover the principles of Universal 
Grammar, and emphasizes this mode of explanation (Section 1.3) (cf. 
functionalist orientation). 


generic: in a phrase or sentence, an expression is generic if it refers to a 
whole class, rather than a particular item (Section 9.1) (cf. referential). 


genitive: an inflectional value of the feature CASE (‘possessor’, e.g. English 
's in student's book) (Section 5.1). 


Germanic suffixes: A class of suffixes in English that historically were 
mostly inherited from Germanic (Section 6.3.5). 


government: a syntactic relation in which one word requires another word 
or phrase to have a particular inflectional value (e.g. assigning a case value 
to a noun) (Section 5.3) (cf. agreement). 


gradable adjective: an adjective that is semantically compatible with the 
feature DEGREE (e.g. English happy is gradable because the comparative 
form happier can be formed from it) (Section 5.3). 


grammatical theory: an architecture for the description of grammatical 
structure (Section 1.3). 


grid: a notation convention for representing paradigms (Chapter 2 
Exploratory Exercise, Appendix to Chapter 5). 


habitual: an inflectional value of the feature ASPECT (‘an event that is 
repeated regularly’) (Section 5.1). 


hapax (legomenon): a word that occurs exactly once in some corpus 
(Section 6.5). 


hapax-conditioned degree of productivity: a measure of productivity; the 
ratio of the number of hapax legomena with a given morphological pattern 
to the number of hapax legomena with all morphological patterns (Section 
6.5). 
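
This ratio can be sketched in Python as well. The function name and the toy corpus are invented for illustration, and pattern membership is decided here by a crude string test, which is an assumption; a real study would rely on morphological analysis instead.

```python
from collections import Counter


def hapax_conditioned_productivity(corpus_tokens, has_pattern):
    """Hapaxes with the pattern divided by all hapaxes in the corpus.

    `has_pattern` is a caller-supplied test for pattern membership.
    """
    counts = Counter(corpus_tokens)
    # Hapax legomena: words that occur exactly once in the corpus.
    hapaxes = [word for word, frequency in counts.items() if frequency == 1]
    if not hapaxes:
        raise ValueError("corpus contains no hapax legomena")
    with_pattern = [word for word in hapaxes if has_pattern(word)]
    return len(with_pattern) / len(hapaxes)


# Invented toy corpus: the hapaxes are cat, sat, oddness and wryness.
toy_corpus = ["the", "the", "cat", "sat", "darkness", "darkness",
              "oddness", "wryness"]
share = hapax_conditioned_productivity(
    toy_corpus, lambda word: word.endswith("ness"))
```

Unlike the category-conditioned measure, the denominator here is the number of hapaxes of every kind, so the result expresses the pattern's share of all rare words in the corpus.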


head: see semantic head and formal head. 


head-final language: a language in which syntactic heads are located at 
the end of their phrases (e.g. adpositions come after nouns) (Chapter 7 
Exploratory Exercise). 


head-initial language: a language in which syntactic heads are located 
at the beginning of their phrases (e.g. adpositions come before nouns) 
(Chapter 7 Exploratory Exercise). 


hierarchical structure: the constituent structure of and dominance relations 
between elements in a word (morphemes, bases) or elements in a sentence 
(words, phrases) (Section 3.2.1, Chapter 7). 


homonymy: two word-forms are homonymous if their pronunciation is 
identical (Section 8.6). 


host: a clitic's host is the element that a clitic combines with; a clitic’s 
prosodic host and syntactic host may differ (Section 9.2). 


hypothetical: an inflectional value of the feature MOOD (‘counterfactual but 
possible event’). 


idiomaticity = non-compositional meaning. 


imperative: an inflectional value of the feature MOOD (‘speaker issues 
command to hearer’) (Section 5.1). 


imperfective: an inflectional value of the feature ASPECT (‘an event seen 
from within or as not completed’) (Section 5.1). 


implicational relationship: a predictive relationship such that if A is true, 
B must also be true, but not necessarily the reverse; in language typology, 
implicational relationships are usually probabilistic (if A is true, B has a 
high probability of being true), rather than absolute (Chapter 7 Exploratory 
Exercise). 


inalienable possession: indicates a relationship such that the possessor of 
an object cannot be changed; body parts are classic examples of inalienably 
possessed objects (Section 9.1). 


inchoative verb: a deadjectival verb expressing the derivational meaning 
‘begin to be Adj., become Adj.’ (e.g. Spanish verdear ‘become green’ is the 
inchoative of verde ‘green’) (Sections 5.2-5.3). 


inclusive first person: refers to a group including both the speaker and the 
addressee (Chapter 8 Exploratory Exercise) (cf. exclusive first person). 


indeclinable: a lexeme that does not have different inflected forms, and 
differs from other lexemes in this respect (Section 8.2.3). 


indicative: an inflectional value of the feature MOOD (‘an event thought of 
as occurring in reality’) (Section 5.1). 


inessive: an inflectional value of the feature CASE (‘inside of’) (Chapter 4 
Comprehension Exercises). 


infinitive: an inflectional meaning of verbs: a nonfinite form used for 
clausal complements when the complement subject is identical to the 
matrix subject; often has nominal properties (Section 5.1). 


infix: an affix that occurs inside the base (Section 2.2). 


inflect: When we say that a word inflects (for some value) we mean that it 
has (inflectional) word-forms for that value, e.g. ‘Russian verbs inflect for 
gender’, i.e. Russian verbs distinguish different word-forms for different 
genders (of the subject argument). 


inflection (= inflectional morphology): the relationship between 
word-forms of a lexeme; a part of morphology that is characterized by relatively 
abstract morphological meanings, semantic regularity, almost unlimited 
applicability, etc. (Section 2.1 and Chapter 5). 


inflection class: a class of lexemes that inflect in the same way - i.e. that 
show the same suppletive allomorphy in all word-forms of their paradigm 
(Section 8.2, Chapter 12). 


(inflection) class shift: a diachronic change in which a lexeme belonging to 
one inflection class becomes associated with a different class (Sections 8.3, 
6.4.3). 


inflectional exponent = exponent. 


(inflectional) feature (= inflectional category): a set of inflectional values 
that share a semantic property and are mutually exclusive, e.g. TENSE, CASE 
and VOICE (Section 5.1). 


(inflectional) value (= inflectional property, inflectional feature value): a 
specified value within an inflectional feature — e.g. FUTURE (from the feature 
TENSE), ACCUSATIVE (from the feature CASE), PASSIVE (from the feature 
VOICE) (Section 5.1). 


inherent case: a value of the feature CASE that meets the criteria for inherent 
inflection, e.g. locative, ablative, and instrumental for nouns (Section 5.4.1) 
(cf. structural case). 


inherent inflection: a part of inflectional morphology consisting of features 
that are relevant to the syntax but convey a certain amount of independent 
information; inherent inflection is distinguished from contextual inflection 
by virtue of sharing properties with derivation; an inflectional feature may 
be inherent for one word-class and contextual for another (Section 5.4.1). 


inheritance1 hierarchy: a descriptive device in which a tree structure is used 
to represent similarities among inflection classes (i.e. similarities among rule 
schemas). Shared information is specified on higher nodes, and information 
that is specific to individual classes on lower nodes. A lower node inherits 
information from its mother node in the default situation (Section 8.4). 
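
Default inheritance of this kind can be sketched as a lookup that climbs the tree until it finds a specification. The class name, the cell labels and the suffixes below are hypothetical and do not describe any particular language.

```python
class InflectionClassNode:
    """A node in an inheritance hierarchy of inflection classes."""

    def __init__(self, name, mother=None, own=None):
        self.name = name
        self.mother = mother    # higher node, or None for the root
        self.own = own or {}    # information specified on this node only

    def exponent(self, cell):
        """Return the exponent for a cell, inheriting by default."""
        node = self
        while node is not None:
            if cell in node.own:
                return node.own[cell]   # the most specific specification wins
            node = node.mother          # otherwise inherit from the mother node
        raise KeyError(cell)


# Hypothetical classes: shared information sits on the higher node.
nouns = InflectionClassNode("noun", own={"PL": "-s", "GEN": "-n"})
class_a = InflectionClassNode("class A", nouns, {"SG": "-a"})
class_b = InflectionClassNode("class B", nouns, {"SG": "-o", "PL": "-i"})

inherited_plural = class_a.exponent("PL")
overridden_plural = class_b.exponent("PL")
```

Class A inherits the plural exponent from its mother node, whereas class B overrides it with its own specification; both classes still share the genitive exponent stated once on the higher node.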


inheritance2 = argument inheritance. 


instrumental: an inflectional value of the feature CASE that indicates that the 
marked noun is the means by which an action is accomplished (e.g. Russian 
rezat’ [cut.INF] nož-om [knife-INS] ‘to cut with a knife’) (Section 5.4.1). 


integrated affix: an affix that triggers or undergoes a morphophonological 
alternation, exhibits the phonotactics of a monomorphemic word, and 
tends to occur close to the root (Section 10.4) (cf. neutral affix). 


intensive adjective: a deadjectival adjective signalling an increased degree 
of the base, e.g. Turkish yepyeni ‘brand new’ is derived from yeni ‘new’ 
(Section 5.2) (cf. attenuative adjective). 


interfix: a semantically empty affix that occurs between the two members 
of a N + N compound (especially in German and some other European 
languages) (Section 7.1). 


interlinear (morpheme-by-morpheme) gloss: a notation convention used 
by linguists to help readers understand the structure of a morphological 
example (Appendix to Chapter 2). 


intransitive: a verb that does not take a direct object is called intransitive (cf. 
transitive). 


isolating language: a language that makes only minimal use of morphology 
(Section 1.2) (cf. agglutinative language, fusional language). 


Latinate suffixes: A class of suffixes in English that historically derive 
mostly from Latin or Greek via large-scale borrowing of vocabulary; 
Latinate suffixes combine (almost) exclusively with Latinate bases (i.e. 
bases of Latin or Greek origin) (Section 6.3.5). 


lengthening: a type of base modification that involves making a sound 
(usually a vowel) longer (Section 3.1.2). 


lenition = weakening. 


level ordering: a hypothesis about grammar architecture, according to 
which affixation rules are separated into two or more levels. These levels 
are ordered relative to each other in derivation2, and within each level, 
affixation rules operate prior to sets of phonological rules that are associated 
with that level (Section 10.4). 


levelling = analogical levelling. 


lexeme: a word in an abstract sense; an abstract concept representing the 
core meaning shared by a set of closely-related word-forms (e.g. lives, live, 
lived) that form a paradigm (Section 2.1). 


lexeme formation = word-formation. 


lexical access: the mental process of looking up a word in the lexicon 
(Section 4.3). 


lexical alternation = morphophonological alternation. 


lexical category = word-class. 


lexical conditioning: when the choice of allomorph is determined by the 
lexeme it attaches to (Section 2.3). 


lexical entry: a listing in the lexicon (Chapter 3). 


(lexical) gang: a group of words that shares an (inflectional) morphological 
pattern and that is also highly similar in phonological form, e.g. sling, wring, 
swing (Section 6.4.3). 


Lexical Integrity Hypothesis (= Lexical Integrity Principle): a hypothesis 
about the universal nature of language, stating that rules of syntax can refer 
and/or apply to entire words or the properties of entire words, but not to 
the internal parts of words or the properties of the internal parts of words 
(Section 9.4). 


lexical item: any item that is listed in the lexicon, ranging from morphemes 
to words to phrases (e.g. idioms) (Chapter 4). 


lexical neighbourhood: a set of words that are minimally (phonologically) 
different from some other word. 


lexical periphrasis: when a given set of inflectional values is expressed by a 
periphrastic (multi-word) expression for some lexemes, but the same set is 
expressed as a single word for other lexemes (e.g. happier versus periphrastic 
more beautiful — both are comparatives) (Section 8.8). 


lexicon: the list of elements that speakers have to know in addition to the 
rules of grammar (Chapter 4, Section 6.4). 


loanword (= borrowing): a word taken from another language (Section 
8.2.3). 


locative: an inflectional value of the feature CASE that indicates a location: 
‘in, on, at, by,...’ (e.g. Turkish ev-de [house-LOC] ‘in the house’) (Section 5.4.1). 


locative applicative: a valence-changing operation that creates a new direct 
object argument for the locational participant (Section 11.1.5). 


masculine: an inflectional value of the feature GENDER. 


masdar: an inflectional action noun (Sections 5.1, 11.4). 


mass noun: a noun that refers to a group of objects as a collective entity, 
rather than as a group of individual member entities; often has a form for 
only one number value (e.g. English information, *informations, or furniture, 
*furnitures) (Sections 5.3, 8.2) (cf. count noun). 


memory strength (= resting activation level): the relative strength of the 
representation of a word or morphological pattern in the lexicon (e.g. as a 
result of token frequency) (Section 4.3, Chapter 12). 


metathesis: a type of base modification that involves switching the order of 
two contiguous sounds or groups of sounds (e.g. syllables) (Section 3.1.2). 


moderate word-form lexicon: the hypothesis that a speaker's lexicon 
includes simple and complex words, morphemes (or morphological 
patterns), and derived stems. Whether any particular complex word is 
listed in the lexicon depends on a variety of factors (Sections 4.3, 6.4.1) (cf. 
morpheme lexicon, strict word-form lexicon). 


monomorphemic: containing one morpheme. 


monosyllabic: containing one syllable. 


mood: an inflectional feature of verbs that has to do with the speaker's 
level of commitment to the actuality of the event, or its desirability or 
conditionality (values: imperative, subjunctive, indicative, conditional, optative, 
etc.) (Section 5.1). 


morph: a concrete primitive element of morphological analysis. 


morpheme: the smallest meaningful part of a linguistic expression that 
can be identified by segmentation; a frequently occurring subtype of 
morphological pattern (Section 1.1). 


morpheme alternant = allomorph. 


morpheme-based model: a collective term for approaches to morphological 
analysis in which morphological rules are thought of as combining 
morphemes in much the same way as syntactic rules combine words 
(Section 3.2.1) (cf. word-based model). 


morpheme lexicon: the hypothesis that a speaker's lexicon lists (to the 
extent possible) only simple monomorphemic elements, i.e. roots and 
affixes, rather than complex words (Section 4.1) (cf. moderate word-form 
lexicon, strict word-form lexicon). 


morpheme structure condition: a restriction on the co-occurrence of sounds 
within a morpheme (Sections 4.2, 10.5). 


morphological conditioning: when the morphological context (usually, 
grammatical function) determines the choice of allomorph (Section 2.3). 


(morphological) correspondence: a convention used to represent the 
association between morphologically related sets of words; a morphological 
rule in the word-based model (Section 3.2.2). 


morphological pattern: a pattern across words in which a recurrent aspect 
of meaning corresponds to a recurrent aspect of form (Section 3.1). 


(morphological) rule: a formal description of a morphological pattern 
(Chapter 3). 


morphology: (the study of) systematic covariation in the form and meaning 
of words (Section 1.1). 


morphophonological alternation (= lexical alternation): an alternation
that is at least partly morphologically or lexically conditioned (Chapter 10). 


morphophonological rule: a formal description of an alternation; a rule 
that derives₂ a surface representation from an underlying representation
by changing the shape of the word when certain (morpho)phonological 
conditions are met (Section 2.3). 


morphophonology: the study of morphophonological alternations. 


morphosyntactic feature: a(n inflectional) feature that is relevant to both 
morphology and syntax, such as case. 


morphosyntactic locus: the stem in a compound where morphosyntactic 
features are expressed (Sections 7.2, 7.4). 


morphosyntactic representation: a set of (inflectional) values that can be 
combined with a lexeme and that are relevant to both morphology and 
syntax; in some theories, the bottom node in syntactic trees consists of
morphosyntactic representations (i.e. inflectional values + word-class, but 
without specification of a particular lexeme). 


natural class: a group of inflected forms constitute a natural class if they are 
all and the only forms that express some set of inflectional values (Section 
8.6.2). This term has a parallel use in phonology - a group of sounds 
constitute a natural class if they share some phonological feature or set of 
features (e.g. labial) (Section 10.1). 


natural syncretism: a pattern of syncretism in which the syncretic forms 
constitute a natural class (Section 8.6.2). 


negative adjective: a deadjectival adjective signalling a reversal of the 
(positive) quality of the base (e.g. English unhappy, derived from happy). 


neologism: a new lexeme that is attested, but had not previously been 
observed in the language (Section 4.2, Chapter 6) (cf. actual word,
occasionalism, possible word). 


neuter: an inflectional value of the feature GENDER. 


neutral affix: an affix that does not trigger or undergo a morphophonological
alternation, may show phonotactic peculiarities, and tends to occur further 
from the root (Section 10.4) (cf. integrated affix). 


nominative: an inflectional value of the feature CASE (‘the case of the subject,
the case-form that is used as citation form’) (Section 5.1).


nominative-accusative language: a language that uses the same grammatical 
markers to indicate the argument of an intransitive verb and the agent of a 
transitive verb, and a different grammatical marker to indicate the patient 
of a transitive verb (cf. ergative-absolutive language). 


nonce formation = occasionalism. 


non-compositional meaning (= idiomaticity): see compositional meaning 
(Sections 3.2.2, 5.3, 9.1). 


non-concatenative operation: a morphological operation that cannot 
be straightforwardly described as stringing together of two morphemes 
(Sections 3.1.2-3.1.4). 


non-gradable adjective: see gradable adjective (Section 5.3). 


non-word: a sequence of sounds that adheres to the phonological rules of a 
language, and therefore sounds like a word, but has no meaning (Chapter 
3 Exploratory Exercise). 


noun incorporation: N V compounding in which the verb is the head; 
found especially in polysynthetic languages (Sections 7.1, 9.1, 11.2.1).


NP = noun phrase.


number: an inflectional feature of nouns, having to do with the number of 
items a noun refers to (values: singular, plural, dual, paucal, etc.) (Section 5.1). 


objective case: an inflectional value of the feature CASE that indicates an 
object role (Section 8.5). 


oblique₁: oblique cases are all morphological cases apart from the most
basic ones (e.g. all but the nominative and accusative). 


oblique₂: syntactic functions other than the subject and direct object; the
term is often used even when these functions are expressed by syntactic 
means, rather than morphological case (Section 11.1.1). 


occasionalism (= nonce formation): a neologism that has not caught on 
and is restricted to occasional occurrences (Sections 4.2, 6.1) (cf. actual word, 
neologism, possible word). 


optative: an inflectional value of the feature MOOD that indicates a desire or
wish for some event to occur. 


palatalization: a type of base modification that involves changing the place 
of articulation of a sound (usually a consonant) so that it is pronounced at 
the palate (Section 3.1.2). 


paradigm: the structured set of word-forms of a lexeme (Section 2.1). (Often 
subsets that belong together (e.g. all past-tense forms of a verb) are also 
referred to as paradigms.) 


paradigm rule: a word-based rule consisting of multiple correspondences 
between word-forms in an inflectional paradigm (Section 8.3). 


paradigmatic gap: the ‘missing’ word-form in a defective lexeme.


paradigmatic periphrasis: when a given set of inflectional values is always 
expressed by a periphrastic (multi-word) expression, but individual values 
in the set are expressed by a single-word expression elsewhere in the 
paradigm (e.g. in Latin, the combination of perfect and passive is always 
periphrastic, but the combination of perfect and active is not periphrastic) 
(Section 8.8). 


paradigmatic relations: relations between units that could (potentially) 
occur in the same slot (Section 8.1). 


parsing ratio: the percentage of words with a given morphological pattern 
that are decomposed (parsed) in lexical access, according to a parallel dual 
route model of lexical access (Section 6.4.1). 


participle: a deverbal adjective that may retain some verbal properties 
(Sections 5.1, 11.4).


partitive: an inflectional value of the feature CASE that denotes a subpart of
a collective entity (Chapter 4 Comprehension Exercises). 


passive: an inflectional value of the feature VOICE that signals that the
patient is the grammatical subject; a function-changing operation in which 
the agent is backgrounded (Sections 5.1, 11.1.2). 


past: an inflectional value of the feature TENSE (‘occurring earlier than the 
moment of speech’) (Section 5.1). 


patient: a semantic role; the participant that undergoes an action (Chapter 
11). 


patient noun: a deverbal noun that refers to the verb's patient, e.g. English 
invit-ee (derived from the verb invite, and indicating the person who is the 
recipient of the action) (Section 5.2). 


paucal: an inflectional value of the feature NUMBER (‘a few’) (Section 5.1). 


perfect: an inflectional value of the feature ASPECT (‘an event that took place
in the past but has current relevance’).


perfective: an inflectional value of the feature ASPECT (‘an event seen from
the outside or as completed’) (Section 5.1). 


performance: use of language (Section 6.1) (cf. competence). 


periphrastic construction: a multi-word phrase that cumulatively expresses 
some set of inflectional values. A periphrastic construction fills a cell of 
an inflectional paradigm (Sections 8.8, 9.4) (subtypes: lexical periphrasis, 
paradigmatic periphrasis, categorial periphrasis). 


person: an inflectional agreement feature of verbs (person of subject or 
object) and nouns (person of possessor) (values: 1st, 2nd, 3rd) (Section 5.1). 


phonetic alternation = automatic alternation. 


phonological allomorph: two allomorphs are phonological if they have 
quite similar phonological shape (Section 2.3, Chapter 10). 


phonological conditioning: when the phonological context determines the 
choice of allomorph (Section 2.3). 


phrasal compound: a compound in which an entire phrase is the dependent 
member, e.g. a [[floor of the birdcage] taste] (Chapter 9 Exploratory Exercise). 


phrasal node: a non-terminal node in a syntactic tree diagram, representing 
a phrasal constituent (Section 7.4). 


phrase structure rule: in syntax, a rule stating how words may be combined 
to form a phrase (Section 3.2.1). 


plural: an inflectional value of the feature NUMBER (‘more than one’) 
(Section 5.1).


poetic licence: the creation of neologisms by unproductive rules in a way
that violates ordinary language norms (Section 6.2). 


polarity: an inflectional feature of verbs that indicates the positive or 
negative status of the event (e.g. Japanese kir-u [cut-PRS] ‘cuts’, kir-ana-i
[cut-NEG-PRS] ‘doesn’t cut’) (Section 5.1). 


polysyllabic: containing multiple syllables. 


polysynthetic language: a language that makes very extensive use of 
morphology (Section 1.2) (cf. analytic language, synthetic language). 


portmanteau morph: an affix or stem that cumulatively expresses two 
meanings that would be expected to be expressed separately (Section 4.1). 


possible word (= potential word): a lexeme that could be formed according 
to word-formation rules but is novel and perhaps never used before 
(Section 4.2) (cf. actual word, neologism, occasionalism).


postposition: similar to a preposition, except that postpositions are 
syntactically positioned after noun phrases rather than before them 
(Chapter 7 Exploratory Exercise). 


potential word = possible word.


prefix: an affix that precedes the base (Section 2.2).


present: an inflectional value of the feature TENSE (‘occurring simultaneously 
with the moment of speech’) (Section 5.1). 


Priscianic formation: the formation of an inflected form on the basis of 
another inflected form that is not closely related semantically (Section 8.5). 


privative adjective: a denominal adjective signalling lack of possession 
of the base noun (N-PRIV ‘lacking N’, e.g. Russian bezgolosyj ‘voiceless’,
derived from the noun golos ‘voice’) (Section 5.2). 


proclitic: a clitic that precedes its host. 


productivity: a morphological pattern or rule is productive if it can be 
applied to new bases to create new words (Section 4.2, Chapter 6). 


profitability = degree of generalization. 


progressive: an inflectional value of the feature ASPECT (‘an event that is
in progress’).


proportional equation: a way of representing the relationship between the
model and the target in analogical change (Section 6.4.3). 


proprietive adjective: a denominal adjective signalling possession of the 
base noun (‘having N’, e.g. Hungarian nagy hatalm-ú uralkodó [great power-PROPR
monarch] ‘monarch having great power’) (Sections 5.2, 11.4).


prosodic dependence: a property of clitics and affixes; a prosodically
dependent element is smaller than a prosodic word and must ‘lean on’ a
host (Section 9.2). 


prosodic word: a unit that acts as the domain for word stress (Section 9.2). 


prototypical member: in a lexical gang, one or more word-forms that 
exhibit the most specific set of phonological properties relevant to the gang; 
the ‘centre’ of the gang (Section 6.4.3).


pure stem: a stem that is not identical to any member of the inflectional 
paradigm (Section 7.1). 


purposive: a derivational meaning expressing purpose or goal-oriented 
action. 


quality noun: a derivational meaning of deadjectival nouns (e.g. English 
goodness from good) (Section 5.2). 


recipient applicative: a valence-changing operation that creates a new 
object argument for the recipient participant (Section 11.1.5). 


reduplicant: the copied element in a reduplication (Section 3.1.3). 


reduplication: a formal operation whereby (part of) the base is copied and 
attached to the base (Section 3.1.3). 


referential: in a phrase or sentence, an expression is referential if it refers to 
some specific, existing entity (Section 9.1) (cf. generic). 


reflexive: a primarily function-changing operation signalling that agent 
and patient are coreferential (Section 11.1.2). 


relational adjective: a denominal adjective signalling some kind of relation 
to the base noun, e.g. Russian korolevskij ‘royal’ is the relational form derived 
from korol’ ‘king’ (Section 5.2). 


relative clause: a subordinate clause that modifies a noun (e.g. ‘the man 
who ate everything’). 


relic alternation: an instance of allomorphy that occurs in very few words 
and is not productive. Typically, relic alternations were productive at an 
earlier stage of the language, but subsequently levelled in all but a few 
(high frequency) words (Section 10.2). 


repetitive: a deverbal derivational meaning of verbs: ‘again’ (e.g. English 
rewrite, derived from write) (Section 5.2). 


restrictions = selectional restrictions. 


resultative: an event-changing operation signalling that there is no ‘cause’ 
and ‘become’ element in the event structure (Section 11.1.2).


reversive: a deverbal derivational meaning of verbs: ‘reverse or undo the
effect of the base verb’ (e.g. English unfasten, derived from fasten) (Section 
5.2). 


root: a base that cannot be analyzed further — i.e. a base that consists of a 
single morpheme (Section 2.2). 


rule = morphological rule. 


rule of referral: an inflectional rule stating that two cells in a paradigm 
have the same phonological form (Section 8.6.3). 


rule schema: a schema that generalizes over several different morphological 
rules that exhibit similarities (Section 8.4). 


schema = word-schema. 


second (person): an inflectional value of the feature PERSON (‘refers to the 
addressee’) (Section 5.1). 


second-position clitic (= Wackernagel clitic): a well-known type of special 
clitic; second-position clitics appear after the first element (word or syntactic 
constituent) of a simple sentence (Section 9.3). 


segment: to break up complex words into individually meaningful parts
(Chapter 2). 


selectional restrictions (= restrictions): conditions that define the domain of 
a rule, e.g. some affixes ‘select’ only bases that have a particular phonological
shape, particular semantic meaning, etc. (Sections 3.1.1, 6.1, 6.3). 


semantic head: the semantic head of a compound or a syntactic phrase is 
the hyponym of the whole expression (Sections 7.1-7.2). 


semantic role (= thematic relation): the semantic role of a noun phrase is
the manner in which the noun participates in the action or state expressed 
by the verb, e.g. as agent, patient, theme, etc. (Chapter 11). 


semantic scope: when an affix C combines with a complex unit AB, C 
has semantic scope over AB; the meaning of ABC should be equal to the 
meaning of AB + the meaning of C (which crucially may be different than 
the meaning of A + the meaning of BC) (Section 7.3). 


semantic valence = argument structure. 


separability: a syntactic test of word status; phrasal constituents can be 
separated by other words, but words cannot intervene between parts of a 
word (incl. compound members) (Section 9.1). 


shortening: a type of base modification that involves making a sound 
shorter (Section 3.1.2).


simple clitic: a clitic that can appear in the same syntactic positions as a free
form of the same word-class (Section 9.3) (cf. special clitic). 


simple event noun: a deverbal noun that refers to the event or action itself, 
but which does not preserve the base verb's argument structure (Section 
11.3.2). 


single-component hypothesis: a hypothesis about the architecture of the 
linguistic system according to which morphological rules are collected 
together into a single grammatical component that applies before, or in 
parallel to, syntax (Section 5.4.2) (cf. split-morphology hypothesis). 


singular: an inflectional value of the feature NUMBER (‘one’) (Section 5.1). 


source: a semantic role: the participant that is the initiating location or state 
for the action (Section 11.1.1). 


special clitic: a clitic whose syntactic distribution differs from that of free 
forms of the same word-class, and must be described in its own right 
(Section 9.3) (cf. simple clitic). 


split-morphology hypothesis: a hypothesis about the architecture of the 
linguistic system according to which morphology is divided between two 
grammatical components: word-formation rules apply before syntactic 
rules, whereas inflectional rules apply after syntactic rules (Section 5.5.1) 
(cf. single-component hypothesis). 


stative verb: a verb with the semantic property of referring to a state of 
existence, rather than a physical action (e.g. be is a stative verb in English) 
(Section 5.3). 


stem: the base of an inflected word-form (Section 2.2). 


stimulus₁: a semantic role: the participant that represents the content of the
experiencer's experience (Section 11.1.1). 


stimulus₂: in psycholinguistics, a test item presented to a participant during
the course of an experiment. 


stress shift: a type of base modification that involves changing the syllable 
in a word with which primary stress is associated (Section 3.1.2). 


strict word-form lexicon: the hypothesis that a speaker's mental dictionary 
lists both simple and complex words, regardless of whether the complex 
words have predictable meaning or form, but does not list affixes (Section 
4.2) (cf. moderate word-form lexicon, morpheme lexicon). 


strong suppletion: a kind of allomorphy in which allomorphs of the same 
morpheme are phonologically radically different (Section 2.3).


structural case: a value of the feature CASE that meets the criteria for
contextual inflection, e.g. nominative, accusative, and genitive for nouns 
(Section 5.4.1) (cf. inherent case). 


structure preservation: the property of morphophonological alternations 
of not introducing new segments (Section 10.1). 


subcategorization frame = combinatory potential. 


subjunctive: an inflectional value of the feature MOOD (‘a non-realized
event in a subordinate clause’) (Section 5.1). 


subtraction: a type of base modification that consists in deleting a segment 
(or more than a segment) from the base (Section 3.1.2). 


suffix: an affix that follows the base (Section 2.2). 


superlative: an inflectional value of the feature DEGREE (‘highest degree, 
most’) (Section 5.1). 


suppletion: a kind of allomorphy in which two allomorphs of the same 
morpheme are not similar in pronunciation (Sections 2.3, 8.2) (subtypes: 
strong suppletion, weak suppletion). 


suppletive allomorph: two allomorphs are suppletive (= show suppletion) 
if they are not similar in pronunciation (i.e. cannot be related to each other 
by (morpho)phonological rules) (Sections 2.3, 8.2). 


surface representation: a word-form as it is actually pronounced by speakers;
a form derived₂ from an underlying representation by morphophonological
rules (rules of derivational phonology) (Sections 2.3, 10.3). 


synchronic: having to do with language at a given point in time (Section 
6.1) (cf. diachronic). 


syncretism: systematic homonymy of inflected words in a paradigm 
(Sections 8.6, 12.1.3). 


synonymy blocking = blocking.


syntactic agreement = agreement.


syntactic function: the syntactic function of a noun phrase is the way in 
which the noun phrase’s semantic role is encoded in syntactic structure, e.g. 
as subject, object, oblique (Chapter 11). 


syntactic government = government.


syntactic valence = function structure.


syntagmatic relations: relations between units that (potentially) follow 
each other in speech (Section 8.1).


synthetic compound: an N-N compound with a deverbal nominal head
and with a dependent noun that fills an argument position in the head's 
valence (Section 11.2.3). 


synthetic language: a language that uses a fair amount of morphology 
(Section 1.2) (cf. analytic language, polysynthetic language). 


target (for agreement): in syntax, the constituent whose properties 
are determined by the properties of another constituent, e.g. when a noun 
determines the gender property of an adjective that agrees with it, the 
adjective is the target (Section 5.3) (cf. controller). 


tense: an inflectional feature of verbs that has to do with the temporal 
location of the verbal event, especially with respect to the speech event 
(values: present, future, past, etc.) (Section 5.1). 


thematic relation = semantic role.


theme₁: a semantic role: the participant that undergoes a movement or
other change of state (Section 11.1.1). 


theme₂: an older term for ‘stem’.


third (person): an inflectional value of the feature PERSON (‘refers to a third
party’) (Section 5.1). 


token frequency: a count of how frequently some structure (word-form, 
morpheme, lexeme, etc.) is used in some sample of language (Section 4.3, 
Chapter 12) (cf. type frequency). 


tonal change: a type of base modification that involves changing the tone 
pattern of a word (Section 3.1.2). 


transitive: a verb that takes a direct object is called transitive. Transitivity is 
the property of being either transitive or intransitive (Section 8.2.1). 


transposition: change of word-class by a morphological operation (Sections 
11.3-11.4). 


tree diagram: in syntax and morphology, a convention for representing 
hierarchical constituent structure (Sections 3.2.1, 7.2-7.4). 


type frequency (of a morphological pattern): the number of lexemes that 
exhibit a given morphological pattern is that pattern's type frequency 
(Chapter 3 Exploratory Exercise; Section 6.5) (cf. token frequency). 
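
The contrast between token frequency and type frequency can be made concrete with a small counting sketch. The sample below is a hypothetical toy corpus, and the pattern labels and function names are assumptions made for illustration only; they do not come from the book.

```python
from collections import Counter

# Hypothetical toy sample: (word-form, lexeme, morphological pattern) triples.
sample = [
    ("walked", "WALK", "past -ed"),
    ("walked", "WALK", "past -ed"),
    ("jumped", "JUMP", "past -ed"),
    ("sang", "SING", "past ablaut"),
    ("sang", "SING", "past ablaut"),
    ("sang", "SING", "past ablaut"),
]


def token_frequency(sample):
    """Count every occurrence of each pattern in the sample."""
    return Counter(pattern for _, _, pattern in sample)


def type_frequency(sample):
    """Count the distinct lexemes that exhibit each pattern."""
    lexemes_by_pattern = {}
    for _, lexeme, pattern in sample:
        lexemes_by_pattern.setdefault(pattern, set()).add(lexeme)
    return {pattern: len(lexemes) for pattern, lexemes in lexemes_by_pattern.items()}
```

In this toy sample both patterns occur three times, so their token frequencies are equal, but the suffixing pattern is found on two lexemes and the ablaut pattern on only one, so their type frequencies differ.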


underlying representation: an abstract representation that is not actually 
used by speakers, but that linguists postulate to simplify the rule system; 
morphophonological rules (rules of derivational phonology) operate 
on underlying representations to produce actually pronounced surface 
representations (Sections 2.3, 10.3).


underspecification (of inflectional value): a descriptive device in which a
word-form or morpheme is not specified for a given inflectional value; used 
to formally describe some types of syncretism, esp. natural syncretisms 
(Section 8.6.2). 


Universal Grammar: the innate part of speakers' grammatical knowledge 
(Section 1.3). 


unproductive pattern: see productivity.


usual word = actual word.


valence: information about the semantic roles and syntactic functions of a 
verb (or sometimes another word-class) (Chapter 11). 


value = inflectional value.


voice: an inflectional (and sometimes derivational) feature of verbs that 
indicates a function-changing operation (values: active, passive, reflexive, 
antipassive) (Section 5.1). 


voicing: a type of base modification that involves changing an unvoiced 
sound to its voiced counterpart (Section 3.1.2). 


vowel harmony: a phonological assimilation process that constrains the co- 
occurrence of vowels in the same word-form or lexeme, e.g. in Chukchi, the 
vowels in a word-form must be either all [i,e,u], or all [ə,a,o] (Section 9.1).


VP = verb phrase.


Wackernagel clitic = second-position clitic.


weakening (= lenition): a type of base modification that involves changing
the manner of articulation of a sound so that it is produced with less 
constriction (e.g. stops become fricatives, or fricatives become sonorants) 
(Section 3.1.2). 


weak suppletion: a kind of allomorphy in which allomorphs of the same 
morpheme are not radically different phonologically, but neither are they 
similar enough to be related by (morpho)phonological rules (Section 2.3). 


word: a word-form or a lexeme, or less commonly, a word token (Section 
2.1, Chapter 9). 


word-based model: a collective term for approaches to morphological 
analysis in which the fundamental significance of the word is emphasized 
and the relationship between complex words is captured by formulating 
word-schemas that represent common features (Section 3.2.2) (cf. morpheme- 
based model). 


word-class: a category of words such as ‘noun’, ‘verb’, ‘adjective’, ‘adverb’,
etc. (Section 3.1.1).


word family: a set of morphologically related lexemes (Section 2.1).


word-form: a word in a concrete sense; a sequence of sounds that express 
the combination of a lexeme and a set of grammatical meanings; a word- 
form can be isolated from surrounding elements in speech because it is 
either prosodically independent (= a free form) or a clitic and not an affix
(Section 2.1, Chapter 9). 


word-formation (= lexeme formation): derivation and compounding 
(Section 2.1). 


word-schema: a representation of a set of morphologically related words 
(Section 3.2.2). 


word-structure rule: in the morpheme-based model, a rule stating how 
morphemes may combine to form a word (Section 3.2.1). 


word token: an instance of use of a word in some text or in speech (Section 
2.1, Chapter 9 intro). 


zero affix (= zero expression): an inflectional value is said to be expressed 
by zero if there is nothing in the pronunciation that corresponds to the 
value, so that the presence of the value’s meaning must be inferred from this 
absence of form. (In derivational morphology, morphologists usually talk 
about conversion rather than zero expression, though there is really no deep 
difference); a device used within the morpheme-based model to preserve 
the principle of having only concatenative rules (Sections 3.2.1, 4.1).


¹ ISO 639-3 is an international standard for identifying the world's languages. Every
language is assigned a unique three-letter code. The reader is encouraged to use these
codes to find more information about the languages discussed in the book; the online
language encyclopaedia Ethnologue (www.ethnologue.com) is helpful in this regard.

Language | Family | Geographical area | ISO 639-3 code¹ | Page numbers
Afrikaans | Indo-European, West Germanic | southern Africa | afr | 98
Ainu | Isolate | northern Japan (Hokkaido) | ain | 242-243, 246
Akkadian | Semitic | ancient Mesopotamia | akk | 1
Albanian | Indo-European | Albania | sqi | 28, 35, 36, 45, 154, 229
Alutor | Chukotko-Kamchatkan | Russian Far East (Kamchatka Peninsula) | alr | 138
Arabic (Standard) | Afro-Asiatic, Semitic | Middle East, northern Africa | arb | 9, 20, 36, 87, 95-96, 101, 149, 160, 162, 177-178, 270
Armenian (Eastern) | Indo-European | Armenia | hye | 160, 239
Arrernte (Mparntwe) | Pama-Nyungan | central Australia | aer | 86, 154
Basque | Isolate | Spain and France | eus | 89, 253, 328
Bella Coola | Salishan | central British Columbia coast | blc | 21
Big Nambas | Austronesian, Oceanic | Vanuatu | nmb | 86
Bulgarian | Indo-European, South Slavic | Bulgaria | bul | 275-276
Capanahua | Panoan | Peru | kaq | 145
Chamorro | Austronesian, Malayo-Polynesian | Guam | cha | 243
Chichewa | Niger-Congo, Benue-Congo, Bantu | Malawi | nya | 237-239, 244
Chinese (Mandarin) | Sino-Tibetan | China | cmn | 12, 121-122, 189, 247-249, 263
Chukchi | Chukotko-Kamchatkan | Russian Far East (Kamchatka Peninsula) | ckt | 193, 345
Clallam | Salishan | NW United States | clm | 37-38
Coptic | Afro-Asiatic | Egypt (extinct, still used as liturgical language of the Coptic church) | cop | 55, 64
Croatian | Indo-European, South Slavic | Croatia, Bosnia and Herzegovina | hrv | 57-59, 196
Dutch | Indo-European, West Germanic | Netherlands, Belgium | nld | 69, 98, 101, 198-200, 209, 271
English | Indo-European, West Germanic | England, USA, etc. | eng | 2-12, 14-26, 30-31, 34-35, 37-51, 53, 56, 60-64, 67, 69, 71-73, 76-77, 81-89, 91, 93-96, 98-100, 104, 107, 109-112, 115-119, 121-125, 127-131, 133-135, 137-149, 151-152, 156-157, 163-164, 166, 176, 180-181, 183, 185-186, 189-198, 200, 207, 209-216, 224-227, 230, 232-233, 235-237, 245-246, 249-258, 262-264, 272, 275, 319-321, 324, 327-330, 333-334, 336, 338, 340-342
Even | Altaic, Tungusic | Siberia | eve | 277
Evenki | Altaic, Tungusic | China, Mongolia, Asian Russia | evn | 154
Finnish | Uralic, Finno-Ugric, Balto-Finnic | Finland | fin | 64, 66, 76, 153, 198-199
French | Indo-European, Romance | France, Quebec, etc. | fra | 56, 86, 94, 127, 165, 180, 183, 189, 191, 196, 198, 208, 227, 229, 266-267, 278, 321
Frisian (West) | Indo-European, West Germanic | northern Netherlands | fry | 273
Gaelic (Scottish) | Indo-European, Celtic | Scotland | gla | 36
German | Indo-European, West Germanic | Germany etc. | deu | 6, 20, 23-24, 34-36, 55, 63, 66-67, 69, 87-89, 95-96, 98, 101, 104, 117, 119-120, 124, 126, 130, 139, 142, 144, 160, 174-175, 189-192, 209, 211-219, 221, 229-230, 242, 258-261, 272, 320, 332
Godoberi | North Caucasian, Andic | Daghestan (Russia) | gdo | 271
Gothic | Indo-European, East Germanic | Balkan peninsula | got | 269
Greek (Ancient, Classical) | Indo-European | Greece | grc | 2, 4, 140, 185-186, 333
Greek (Modern) | Indo-European | Greece, Cyprus | ell | 16, 30-31, 62, 133, 167-172, 182, 233, 262
Greenlandic (West) | Eskimo-Aleut | Greenland | kal | 5-6, 88, 240
Guarani | Tupi-Guarani | Paraguay | grn | 208, 245
Hausa | Afro-Asiatic, West Chadic | Nigeria, Niger | hau | 55, 193, 232-233, 271
Hebrew (Modern) | Afro-Asiatic, Semitic | Israel | heb | 7, 13, 120, 213-216, 220, 229, 255, 260
Hindi | Indo-European, Indic, Western Hindi | India, Pakistan, Nepal, etc. | hin | 36-37, 48, 86
Hungarian | Uralic, Finno-Ugric | Hungary, Romania, Moldova | hun | 7, 203-204, 258-260, 339
Igbo | Niger-Congo, Benue-Congo | Nigeria | ibo | 82
Indonesian | Austronesian, Malayo-Polynesian, Malay | Indonesia | ind | 219, 253
Irish | Indo-European, Celtic | Ireland | gle | 158, 229
Italian | Indo-European, Romance | Italy, Switzerland | ita | 21, 30, 69, 92, 96, 125-126, 140-141, 146, 162-165, 180, 190, 230, 253, 275-276
Japanese | Isolate | Japan | jpn | 27, 29, 86-87, 121-122, 152, 194, 212-216, 241, 247-249, 253, 263, 339
Kannada | Dravidian, Southern | Karnataka (southern India) | kan | 122
Khanty | Uralic, Finno-Ugric, Ugric | western Siberia | kca | 268-269
Kikuyu | Niger-Congo, Benue-Congo, Bantu | Kenya | kik | 154
Kobon | Trans-New Guinea, Madang | Papua New Guinea | kpw | 268-269
Korean | Isolate | Korea | kor | 22-23, 26, 86, 88, 141, 159, 321, 324
Koromfe | Niger-Congo, North, Gur | northern Burkina Faso | kfz | 274-275
Krongo | Nilo-Saharan, Kadugli-Krongo | northern Sudan | kgo | 63-64
Lakhota | Siouan | North and South Dakota | lkt | 191-192
Lango | Nilo-Saharan, Eastern Sudanic, Nilotic | southern Sudan | lno | 190-192, 275-276
Latin | Indo-European, Italic | Italy | lat | 2, 16-19, 28-29, 83-84, 92, 94, 96, 98, 103-105, 110, 122, 159-160, 165-167, 172-173, 183-186, 227, 266, 270-271, 273-274, 276, 318, 333, 337
Lezgian | East Caucasian, Lezgic, Nakh-Daghestanian | southern Daghestan (Russia) and northern Azerbaijan (eastern Caucasus) | lez | 5-6, 64-66, 110, 160, 223-224, 230, 258-260
Lithuanian | Indo-European, Baltic | Lithuania | lit | 176-178, 208
Malagasy | Austronesian, Malayo-Polynesian, Barito | Madagascar | mlg | 38
Malay | Austronesian, Malayo-Polynesian, Malayic | Malaysia, Indonesia | zlm | 121
Mangap-Mbula | Austronesian, Malayo-Polynesian, Oceanic | Papua New Guinea | mna | 38-39
Mbay | Nilo-Saharan, Central Sudanic | Chad | myb | 55
Mixtec (Chalcatongo) | Oto-Manguean, Mixtecan | Oaxaca (Mexico) | mig | 37, 154
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Language Family Geographical ISO 639-3 Page 
area code! numbers 
Murle Nilo-Saharan, southern Sudan mur 37 
Eastern 
Sudanic, 
Surmic 
Nahuatl Uto-Aztecan, central Mexico nci 19-20, 91, 
(Classical) Aztecan 110 
Nahuatl Uto-Aztecan, Mexico nhe 246 
(Huauhtla) Aztecan 
Ndebele Niger-Congo, Zimbabwe, Botswana, nde 76 
Benue-Congo, Zambia 
Bantu 
Old Church Indo-European, liturgical language chu 179 
Slavonic Slavic of various Eastern 
European churches 
Old English Indo-European, England ang 6, 53, 158, 
West Germanic 176, 268, 
275-276 
Old High Indo-European, Germany goh 179-180, 
German West Germanic 214-215, 
217-218, 
221, 272 
Ossetic Indo-European, northern Caucasus oss 161 
Iranian (Russia and Georgia) 
Paiute Uto-Aztecan, Western United States pao 154 
(Northern) Numic 
Persian Indo-European, Iran, Afghanistan fas 26, 172 
Iranian 
Pipil Uto-Aztecan, El Salvador ppl 277 
Aztecan 
Pitjantjatjara Pama-Nyungan Southern Australia pit 200-201 
Polish Indo-European, Poland pol 90-91, 127, 
West Slavic 146, 199, 
207, 219, 
228-229, 
233 
Ponapean Austronesian, Micronesia pon 38-39, 89, 
Malayo- 192-193, 
Polynesian, 193, 199, 
Oceanic 208 
Quechua Quechuan, Peru qub 36-37, 98, 
(Huallaga) Quechua I 100, 159, 
318 
Romanian Romance Romania ron 183 
Russian Indo-European, Russia etc. rus 18-20, 
East Slavic 23-25, 70, 
87-89, 94, 
97, 100, 117, 
119-120, 
134-136, 
146, 148, 
151-152, 
163-165, 
180-181, 
186, 212- 
213, 215- 
216, 238- 
239, 253, 
257, 266, 
269, 279, 
320, 328, 
331-332, 
339-340 
Sanskrit Indo-European, India san 6, 69, 109, 
Indic 121-122, 
138, 151, 
266 
Serbian Indo-European, Serbia, etc. srp 63, 66, 
South Slavic 76-77, 96, 
110, 201- 
202, 208 
Somali Afro-Asiatic, Somalia som 29-30, 39, 
Cushitic 48 
Sorbian (Upper) Indo-European, Germany hsb 97, 205, 266 
West Slavic 
Spanish Indo-European, Spain, Latin America spa 25-26, 52, 
Romance 87-89, 
92-93, 99, 
106-110, 
118, 139, 
141-142, 
144, 146, 
151, 182, 
189, 198, 
208, 218, 
233, 262, 
272, 326, 
331 
Sumerian Isolate ancient Mesopotamia sux 1, 4-5 
Swahili Niger-Congo, Tanzania etc. swh 5-6, 84-85, 
Benue-Congo, 88, 92 
Bantu 
Swedish Indo-European, Sweden swe 7, 85, 88 
North 
Germanic 
Tagalog Austronesian, northern Philippines tgl 20, 55-56, 
Malayo- 88, 161, 185 
Polynesian, 
Philippine 
Tamil Dravidian, India, Sri Lanka tam 159-160 
Southern 
Tauya Trans-New Papua New Guinea tya 278 
Guinea, 
Madang 
Tibetan Sino-Tibetan, Tibet xct 141-142 
(Classical) Tibeto- 
Burman, 
Tibetic 
Tiv Niger-Congo, Nigeria tiv 65 
Benue-Congo, 
Bantu 
Tiwa (Southern) Kiowa-Tanoan New Mexico tix 204-205 
Tsimshian Penutian northern coast of tsi 190 
(Coast) British Columbia 
Tümpisa Uto-Aztecan, California, Nevada par 173-174 
Shoshoni Numic 
Turkish Altaic, Turkic, Turkey tur 22, 28, 
Southern 68-69, 89, 
95, 100, 
106, 159, 
193, 213, 
215-216, 
219, 233, 
332, 334 
Tzutujil Mayan Guatemala, Mexico tzj 7, 13, 28, 
39, 89, 240, 
267-268 
Udi East Caucasian, Azerbaijan udi 205 
Lezgic 
Udmurt Uralic, Finno- central Russia udm 267, 277, 
Ugric, 318 
Permic 
Urdu Indo-European, Pakistan, India, urd 36-37, 48, 86 
Indic, Nepal, etc. 
Western Hindi 
Vietnamese Austro-Asiatic, Vietnam vie 4-6, 82 
Mon-Khmer 
Welsh Indo-European, Wales cym 160-161, 
Celtic 164-165, 
271, 274, 
276 
Yimas Sepik-Ramu Papua New Guinea yee 55, 91 
Yoruba Niger-Congo, Nigeria yor 4-7 
Benue-Congo, 
Defoid 
Zulu Niger-Congo, southern Africa zul 162, 221- 
Benue-Congo, 222, 233 
Bantu 


Subject index 


Page numbers in bold are pages where the term in question is in bold within 
the text. 


acceptability judgements, 129, 134, 136, 
318 
acquisition, 122 
acronyms, 40, 318 
action nouns, 67, 87, 99-100, 120, 133, 
318 
valence inheritance, 251—255, 258, 
263 
See also event nouns 
actual words, 71, 114, 129-130, 134, 318 
adjectives 
derivational types, 89; attenuative, 
320; facilitative, 86, 328; 
intensive, 332; privative, 339; 
proprietive, 258-260, 339; 
relational 89, 340 
gradable, 93, 330 
inflectional features and agreement, 
82, 85, 91-93, 101 
participles, relation to, 257-261 
valence inheritance, 255-257 
adjuncts, 243, 318 
adpositions, 82, 152, 318 
adverbs, 238, 262 
affix compounds, 252, 319 
affix ordering 
complexity-based ordering 
hypothesis, 227, 322 
level ordering hypothesis, 222-223, 
226—228, 231—232, 333 


affixes, 19-22, 39, 54, 230, 319 
combinatory potential, 34-35, 43, 47, 
322 
diachronic sources of, 52, 202-203 
integrated and neutral, 223—227, 231, 
332, 336 
types of, 20; circumfixes, 322; 
duplifixes, 39, 326; infixes, 331; 
interfixes, 139, 332; prefixes, 34— 
35, 39, 227, 339; suffixes, 20-21, 
34, 343 
agents, 85, 234—243, 247—249, 319 
agent-adding operations, 241-242 
agent-backgrounding operations, 
237-239, 248, 253 
agentive adjectives, 89, 319 
agent nouns, 86—87, 255—256, 263, 319 
agglutination. See concatenation 
agglutinative languages, 68, 319 
agreement, 90-92, 99, 103, 149, 162-163, 
319 
interaction with syncretism, 177 
word-internal, 97, 149, 150, 204—205 
allomorphy, 22-27, 69, 319 
complementary distribution of 
allomorphs, 23, 27, 158-159, 
211-214, 322 
conditioning of, 25-27, 214—215, 231, 
323; lexical, 333; morphological, 
335; phonological, 338 
comparison of inflection and 
derivation, 90, 96, 101, 111-112 
influence on lexical access, 73-75 
triggered by zero affixes, 45, 54 
See also alternations; suppletion 
alphabetisms, 40, 319 
alternations, 23—24, 25, 211—233, 319 
in the architecture of the grammar, 
222-231 
levelling of, 218, 273-274 
productivity of, 217—220; relic 
alternations, 340 
properties of; optionality, 216, 217, 
231; phonetic coherence, 215; 
phonetic distance, 215 
reanalysis of, 220—222, 229-230, 232 
sensitivity to word/morpheme 
boundaries, 69, 199, 202, 206 
types of; automatic, 211—212, 214—217, 
231, 320; morphophonological, 
211-217, 231, 232, 335 
See also allomorphy 
analogy, 127, 132, 171-172, 181, 319 
extension, 127-129, 319 
levelling, 218, 273—274, 276, 319 
proportional equations, 127-129, 339 
analytic languages, 4-5, 320 
animacy, 160, 184, 270—271, 320 
anticausative verbs, 88, 238—240, 244— 
245, 263, 320 
antipassive verbs, 240, 244, 263, 320 
applicative verbs, 88, 242—245, 263, 320, 
321, 340 
appositional compounds, 141, 150, 320 
architecture of the grammar, 8-9, 40-41, 
51—52, 68 
See also morphology-phonology 
interface; syntax-morphology 
interface 
argument structure, 234—235, 236-262, 
320 
coreferentiality, 239 
inheritance, 253-257, 320 
linkage to function structure, 236- 
237, 241-242, 244, 251 
argument-mixing compounds, 249 
aspect, 85, 92, 106, 320 
inflectional values, 82, 84; 
continuative, 323; durative, 326; 
habitual, 330; imperfective, 330; 
perfective, 85, 338; progressive, 
339 
attenuative adjectives, 89, 320 


augmentative nouns, 87, 320 
auxiliary verbs, 321 
back-formation, 49, 51, 54, 228, 251, 321 
bahuvrihi compounds. See exocentric 
compounds 
bases, 20-22, 36, 137, 200, 321 
lexical access of, 73-75, 123 
in the morpheme-based model, 43 
semantic relation to derived words, 
90, 93-94, 139-145 
in the word-based model, 47-50, 51 
See also affixes; allomorphy; 
alternations; compounds; stems; 
transposition 
base modification patterns, 35-38, 54, 
149, 230, 321 
diachronic sources of, 53, 229 
formal approaches to, 43, 45-46, 48, 
63, 65-66, 75 
influence on lexical access, 73 
beneficiary participants, 243—244, 321 
blending, 40, 321 
blocking, 125-127, 132, 321 
borrowing. See loanwords 
bound forms, 196-197, 206, 321 
bound roots/stems, 21-22, 196, 321 
case, 16, 82, 83-84, 321 
diachronic sources of, 272, 274 
frequency of use, 265-267, 269-272, 
274 
government, 90, 103 
inflectional traits of, 90, 92, 100-101, 
106 
inflectional values, 83; abessive, 318; 
ablative, 83, 318; absolutive, 
243, 318, 327; accusative, 82, 83, 
318; allative, 319; dative, 83, 324; 
elative, 327; ergative, 271, 327; 
essive, 327; genitive, 83, 100, 
329; inessive, 331; instrumental, 
100, 332; locative, 100, 270, 334; 
nominative, 82, 83, 100, 265, 
266, 336; objective, 337; oblique, 
234, 236, 237, 243, 244, 257, 337; 
partitive, 338; possessive, 256 
categorial periphrasis, 183-184, 321 
category-conditioned degree of 
productivity, 124, 130-131, 321 
causative verbs, 88, 241-242, 321 
circumfixes, 20, 322 
citation forms, 83, 322 
classifiers, 322 
clausal arguments, 255 
clefting, 197 
clipping, 40, 322 
clitic groups, 198, 322 
clitics, 196—203, 207, 322, 327, 339 
hosts for, 196-197, 198-202, 330 
and lexical integrity, 203, 205 
simple, 200, 342 
special, 200-203, 207, 342; second- 
position, 200-202, 341 
coalescence, 52, 322 
cognitive realism, 6-7, 11, 40-41, 64, 68, 
74 
combinatory potential, 34-35, 38, 43, 
47, 322 
See also selectional restrictions 
competence, 114-115, 129, 131-132, 322 
complementary distribution (of 
allomorphs), 23, 27, 158-159, 
211—214, 322 
complex words, 2, 18, 33, 34, 137, 322 
borrowing of, 121-122 
storage in the lexicon, 60-63, 67, 70- 
75, 77-80, 227; implications for 
affix order, 227-228; implications 
for blocking, 125-126; 
implications for productivity, 
123-124 
complexity-based affix ordering 
hypothesis, 227—228, 322 
compounds, 8, 18-19, 34-35, 48-49, 54, 
323 
agreement with dependent member, 
204—205 
diachronic sources of, 49, 52; 
borrowing of, 121 
distinguishing from affixed words, 
21-22 
distinguishing from phrases, 190- 
195, 206 
heads, 139-144, 147, 149, 150 
inflection of dependent member, 95, 
101, 104 
productivity of, 117, 121, 130, 138 
types of, 137-142; affixal, 48, 252, 
319; appositional, 141, 150, 320; 
coordinative, 141, 144, 150, 324; 
endocentric, 139-141, 143-144, 
147, 150, 327; exocentric, 140-141, 
144, 150, 327; phrasal, 209, 338; 
synthetic, 245, 249—253, 256, 263, 
344 
stress, 192, 195 
valence, 245—253, 263 


concatenation, 34-35, 40-41, 46-47, 
323 
diachronic sources of, 52-53 
in the morpheme-based model, 
41-46, 53, 54, 226 
in the word-based model, 46-47, 
50—51, 53, 54, 230 
reduplication as, 38-39 
conceptual structure, 236, 239, 323 
concrete nouns, 254, 323 
conditioning environments, 25-27, 
214—215, 231, 323 
conjugation, 159, 323 
See also aspect; mood; number; 
person; tense 
constituent structure, morphological, 3, 
14, 44, 323 
semantic interpretation of, 44—45, 145, 
250-251 
sensitivity to, 69, 205 
and word-class membership, 261 
See also hierarchical structure 
contextual inflection, 100-102, 104-105, 
107, 109, 323 
controllers, agreement, 91, 97, 103, 162, 
187, 324 
converbs, 86, 324 
conversion, 39—40, 43, 45-46, 54, 245, 
324 
coordination ellipsis, 194, 195, 203, 205, 
206, 324 
coordinative compounds, 141, 144, 150, 
324 
coreferentiality, 239 
correspondences. See morphological 
correspondences 
creativity, 116-117, 132, 325 
cross-formation, 50-51, 54, 324 
cumulative expression, 63—64, 66, 75, 
324 
comparison of inflection and 
derivation, 90, 98, 99 
in the architecture of the grammar, 
105, 107 
deadjectival transposition, 87, 88, 89, 
256—257 
declension, 159, 168-170, 324 
See also case, gender, number, 
person 
default rules/patterns, 171, 181, 324 
defectiveness, 180-182, 184—185, 337 
arbitrary derivational gaps, 93 
definiteness, 325 
degree, 85, 325 
frequency of use, 266 
inflectional values; comparative, 
85, 93, 183, 322; positive, 266; 
superlative, 85, 266, 343 
degree of exhaustion of a rule, 130, 
325 
degree of generalization of a rule, 129, 
325 
demonstratives, 82 
denominal transposition, 87, 88, 89, 253, 
258, 325 
deobjective verbs, 240-241, 325 
dependent members of compounds, 
139-140, 150, 209—210, 248, 325 
expandability, 194 
inflection of, 95, 101, 104 
order relative to heads, 154-155, 193 
semantic interpretation of, 147, 149, 
191-192, 194-195, 246, 250—252, 
329 
See also noun incorporation 
dependent verb forms, 86, 325 
deponency, 182, 184, 185, 325 
derivation, 18, 27, 81, 149, 225, 325 
meanings, range of, 87-89 
relationship to inflection 
continuum approach, 81, 90, 99, 
102, 105-107, 109, 323 
dichotomy approach, 81, 89, 98- 
106, 107, 326 
typical properties of, 89-98, 111-112 
See also transposition 
derivational phonology, 226, 231 
derivatives, 18, 125, 253, 261, 325 
See also derivation; transposition 
desiderative verbs, 88, 326 
deverbal transposition, 88, 89, 90, 253- 
258, 260-261 
diachronic changes. See analogy; 
back-formation; coalescence; 
inflection classes, diachronic 
shift; metonymic meaning shift; 
phonological reduction; reanalysis; 
sound change 
diachronic productivity, 130, 326 
diminutive nouns, 87, 99, 326 
Distributed Morphology, 54, 109-110 
domains of rule application, 115, 326 
restrictions on, 118-126 
as tests for wordhood, 193, 196, 198, 
202, 223 
duplifixes, 39, 326 


dvandva compounds. See coordinative 
compounds 
economy of expression, 272, 276 
economical description, 61-62, 66, 74 
elegant description, 6, 7, 11, 40 
elsewhere condition, 178-179, 327 
empty morphs, 63, 64-66, 75, 285, 327 
endocentric compounds, 139-141, 143- 
144, 147, 150, 327 
ergative-absolutive languages, 327 
event nouns, 254, 327 
complex event nouns, 254-256, 
322 
simple event nouns, 254—255, 342 
See also action nouns 
events 
conceptual structure, 236, 239, 323 
event-changing operations, 238, 241, 
244, 327 
exocentric compounds, 140-141, 144, 
150, 327 
expandability (test for wordhood), 194, 
327 
experiencer participants, 234-235, 327 
exponence, 269, 276, 328 
external syntax, 260-261 
extraction (test for wordhood), 197 
factitive verbs, 88, 95, 286, 328 
features. See inflectional features and 
values 
female nouns, 87, 93, 95, 271-272, 328 
final devoicing, 198-199, 211—212, 215, 
216 
form-meaning mismatches, 180-182, 
184, 185 
formalisms, 41—50, 54, 70, 157, 166, 
169-171 
formalist orientation, 9, 328 
free forms, 196-197, 200-201, 206, 328 
freedom of host selection, 198, 202, 206, 
328 
freedom of movement, 197, 200—203, 
206, 328 
frequency of use 
of concatenative patterns, 41, 53 
and defectiveness, 181 
local reversals, 270—272, 273, 277 
memory strength, 334; and 
lexical access, 73-74, 75; and 
productivity, 123-127, 132; and 
analogical levelling, 273, 274, 276 
and structural asymmetries in 
inflection, 265-273, 276 
type frequency and analogy, 128-129 
and word length, 267—268, 272, 277 
fricativization, 213 
fronting, 36, 213, 328 
function structure, 234—236, 251, 256, 
262—263, 329 
function-changing operations, 
329; agent-backgrounding, 
236-238; reflexives, 239; patient- 
backgrounding, 240; object- 
creating, 242-243; relationship to 
inflection, 244, 261-262 
linkage to argument structure, 236- 
237, 241-242, 244, 251 
functionalist orientation, 9, 12, 329 
fusion. See cumulative expression 
fusional languages, 329 
gemination, 36, 54, 329 
gender, 82, 106, 329 
agreement, 91-92 
as a head property, 144 
inflectional values, 82; feminine, 328; 
masculine, 334; neuter, 336 
relationship to inflection classes, 
162-163, 184, 185 
generality as a goal of description, 6, 
329 
generative orientation, 9, 11-12, 226, 
329 
generic reference, 191-192, 194—195, 
329 
Germanic bases/suffixes, 122, 329 
glossing. See notation conventions 
government, syntactic, 90-91, 100, 
148-149, 150, 329 
gradable and non-gradable adjectives, 
93, 330 
grammatical functions, 16-17, 19, 26, 
183 
diachronic sources of, 52 
syntactic relevance, 90 
See also meaning 
grammatical theory, 8, 330 
grammaticality 
acceptability judgements, 129, 134, 
136, 318 
impossible words, 118, 120 
grammaticalization, 52 
grid notation, 107-108, 330 
hapax legomena / hapaxes, 130-131, 
131-132, 135-136, 330 
hapax-conditioned degree of 
productivity, 131, 135 


heads, morphological, 139, 193-195, 330 
formal, 143-144, 146, 147, 149, 150, 
328; as morphosyntactic locus, 
143, 148-149, 335 
semantic, 139-142, 143, 150, 341 
head-final languages, 153, 330 
head-initial languages, 153, 330 
hierarchical structure, 44—45, 137, 142- 
150, 250, 252, 330 
See also constituent structure, 
morphological 
homonymy, 174-176, 330 
See also syncretism 
idiomaticity, 191, 330 
See also meaning, compositionality of 
implicational relationships 
(typological), 153, 331 
implicature, 246, 247, 251-252 
impossible words, 118, 120 
inalienable possession, 190, 331 
inchoative verbs, 88, 93, 331 
incorporation. See noun incorporation 
indeclinable nouns, 165, 331 
infinitives, 86, 331 
infixes, 20, 331 
inflection, 18, 19, 27, 331 
contextual and inherent, 100-102, 
104-105, 107, 109, 323 
diachronic sources of, 202-203 
implications for a word-form lexicon, 
68, 75 
relationship to derivation 
continuum approach, 81, 90, 99, 
102, 105-107, 109, 323 
dichotomy approach, 81, 89, 98- 
106, 107, 326 
relationship to function-changing 
operations, 244 
segmentability of, 63-65 
typical properties of, 89-98 
word-class changing 
(transpositional), 257—263 
See also agreement; conjugation; 
declension; government, 
syntactic; inflection classes; 
inflectional features and values 
inflection classes, 144, 146, 179, 184, 
269—270, 331 
assignment of words to, 160-161 
diachronic shift, 163, 165-167, 176, 
184, 331 
macroclasses and microclasses, 
170-171 
productivity of, 163-165, 184, 185 
relationship to gender, 162-163 
See also inheritance hierarchies 
inflectional features and values, 81-86, 
332 
cross-cutting, 270, 276 
expression by multi-word 
constructions (periphrasis), 
183-184 
frequency of use, 265-277 
notation conventions, 107-109, 328, 
330 
underspecification of, 176-178 
See also agreement; aspect; 
case; gender; government; 
inflection; mood; number; 
person; tense 
inhabitant nouns, 30, 87 
inherent case, 100-102, 104-105, 107, 
109, 332 
inherent inflection, 100-102, 104-105, 
107, 109, 332 
inheritance of arguments. See argument 
structure, inheritance 
inheritance hierarchies, 169-171 
cross-classification, 170 
innateness, 9, 44, 225 
See also Universal Grammar 
instrument nouns, 87, 140-141, 246 
integrated affixes, 223-227, 231, 332 
intensive adjectives, 89, 332 
interfixes, 139, 332 
internal syntax, 260-261 
intransitive verbs, 242, 246—251, 332 
irregularity 
and frequency of use, 274—276, 277 
and lexical access, 75 
morphological, 159, 165 
phonological, 124 
semantic, 94—95, 100, 124, 132 
isolating languages, 5-6, 333 
iteration, 98 
Latinate bases/suffixes, 122 
learning, 122 
lengthening, 36, 38, 39, 54, 333 
lenition. See weakening 
level ordering hypothesis, 222-223, 
226—228, 231—232, 333 
levelling. See analogy 
lexemes, 15-18, 27, 333 
word-class membership, 260-263 
See also derivation; compounds; 
transposition 


lexical access, 33, 72-75, 333 
memory strength, 334; and 
productivity, 123-127, 132; and 
analogical levelling, 273, 274, 276 
via decomposition, 72-75, 106, 324 
via whole-word / direct-route, 72-75, 
326 
See also parallel dual-route processing 
models 
lexical class. See word-class 
lexical conditioning, 26, 27, 214, 333 
Lexical Integrity Hypothesis, 203-206, 
206—207, 333 
lexical periphrasis, 183, 334 
Lexical Phonology, 232 
See also level ordering hypothesis 
lexicon, 33, 60—75, 103-106, 122-129, 
132, 227, 334 
entries (items) in, 33, 43, 60, 236, 333 
morphemic, 61—66, 68, 75, 335 
word-based, 61, 66-69, 70-74, 75, 
123, 324 
lexical gangs, 128, 132, 333, 340 
lexical neighbourhoods, 334 
redundancy of storage, 70-71, 169 
See also lexical access 
loanwords, 163-165, 184, 216-219, 334 
borrowed vocabulary strata, 121-122, 
132 
locative applicative operations, 242, 
334 
markedness, 277 
masdars, 86, 258—260, 262, 334 
mass-count distinction, 160, 324 
meaning, 2, 11, 14-16 
abstractness, 19, 21, 27, 90, 94 
compositionality of, 50, 62, 75, 173, 
323; comparison of inflection and 
derivation, 90, 94—95, 97, 99, 101, 
107; compounds, 191; clitic-host 
combinations, 199-200, 202 
conceptual structure, 236 
derived from constituent structure, 
44—45, 144—145, 250 
predictability of, 17-18, 60—61, 66, 67 
restrictions on word-formation, 119 
role of pragmatic implicature in, 
139-140, 247, 251—252, 256 
semantically empty morphs, 64-65, 
139, 327 
See also blocking; form-meaning 
mismatches; valence 
memory. See lexical access, lexicon 
mental processing. See lexical access 
metathesis, 37, 53, 54 
metonymic meaning shift, 254 
moderate word-form lexicon, 61, 70-74, 
123, 324 
monomorphemic words, 15, 121, 223, 
224, 227, 334 
mood, 82, 84-85, 92, 106, 334 
inflectional values, 82; conditional, 
84, 323; hypothetical, 85, 330; 
imperative, 82, 84, 85, 271, 330; 
indicative, 82, 84-85, 266, 268, 
331; optative, 337; subjunctive, 
82, 84, 266, 343 
morpheme structure conditions, 69, 
223, 225, 227, 230, 335 
morpheme-based model, 335 
approach to morphophonological 
alternations, 222, 231 
lexicon, 61-66, 75, 335 
rules, 41-46, 53, 54, 226 
syntagmatic approach to description, 
157, 167, 178, 182 
morphemes, 3, 11, 14-15, 19-27, 34, 54, 
335 
boundaries between, 69, 205, 227- 
228, 230 
See also affixes 
morphological conditioning, 26, 231, 
335 
morphological correspondences, 47-50, 
54-55, 335 
morphological patterns, types of, 34-40, 
335 
morphology-phonology interface, 8-9, 
61, 103-105, 220, 222-231 
morphophonological alternations. See 
alternations 
morphophonological rules, 23-25, 198, 
227, 231, 232, 335 
morphs, 334 
movement tests, 197, 200-203, 206, 
328 
nasal assimilation, 225 
nasal substitution, 219 
natural classes, 174, 176, 176-177, 215, 
336 
neologisms, 71, 115-117, 132, 218-219, 
336 
blocking of, 125-126 
measuring productivity, 130-131, 
163-164, 184 
neutral affixes, 223-227, 336 


nominative-accusative languages, 336 
nonce formations. See occasionalisms 
non-words, 336 
notation conventions, 4, 27-29 
feature-value notation, 109, 328 
grid notation, 107-108, 330 
interlinear morpheme-by-morpheme 
glossing, 27-29, 332 
tree diagrams, 44—45, 142-143, 344 
nouns, 7-8, 82, 83, 85, 148 
derivational types, 87-88; action, 67, 
87, 99-100, 120, 133, 251-255, 
258, 263; agent, 86, 255-256, 
263, 319; augmentative, 320; 
diminutive, 99, 326; event, 
254—256, 322, 327, 342; female, 
93, 95, 271—272, 328; inhabitant, 
30; instrument, 140-141, 246; 
person, 271; quality, 86, 126, 340; 
status, 87 
concrete, 254, 323 
count, 160, 324 
derived, 87, 256 
noun incorporation, 138, 150, 191, 245- 
247, 263, 336 
noun phrases, 41, 42, 91, 148, 337 
number, 16, 82, 83-85, 337 
inflectional values, 83; dual, 83, 179, 
266, 326; paucal, 83, 338; plural, 
7, 82, 83, 338; singular, 16, 82, 83, 
265-267, 342 
object arguments, 83, 84, 234-247 
object-creating operations, 242-243 
obligatoriness of inflection, 92-93, 
107 
oblique arguments, 238, 257 
occasionalisms, 71, 130-131, 136, 337 
Optimality Theory, 232 
palatalization, 36 
alternation, 212, 215, 219, 221-222, 
228 
morphological pattern by itself, 36, 
54, 337 
paradigmatic gaps. See defectiveness 
paradigmatic periphrasis, 183-184, 337 
paradigms, 16-17, 19, 27, 156-188, 337 
in the architecture of the grammar, 
156-158, 165-185 
cells, number of, 93, 176-177, 322 
paradigm rules, 166-167, 168, 171, 
182, 337; underspecification of, 
177 
rule schemas, 168-171, 179, 341 
parallel dual-route processing models, 
72, 75, 326 
parsing ratios, 123-124, 337 
participles, 86, 172-173, 257-261, 337 
patients, 85, 87, 234—235, 255, 338 
incorporated, 245-252 
passive constructions, 237-239 
patient-backgrounding operations, 
240-241, 243, 247 
performance, 114-115, 131, 132, 338 
periphrasis, 183-184, 321, 334, 338 
person (inflectional feature), 82, 84, 91, 
106, 338 
inflectional values, 84; first, 84, 181- 
182, 187, 327, 328, 331; second, 
84; third, 84 
person nouns, 87, 271 
phonetic motivation 
for alternations, 214-215 
for restrictions on morphological 
productivity, 119 
phonological allomorphy, 23-25, 73 
phonological conditioning, 25, 26, 338 
phonological reduction, 52, 275-276 
phonological rules, 196-197, 202, 221 
phonology, 2, 3, 220-221, 222-230, 
231-232 
sensitivity of, to morphological 
structure, 69, 230-231 
See also morphology-phonology 
interface 
phrasal compounds, 209, 338 
phrase structure rules, 41-43, 103 
poetic licence, 117, 339 
polarity, 86, 339 
inflectional values, 86; affirmative, 
265, 266; negative, 89, 265-266 
polysynthetic languages, 5, 138, 339 
portmanteau morphs, 64, 339 
possessors, 83, 251, 253, 262 
possible words, 34, 71, 114, 120, 130, 339 
postpositions, 152, 339 
potential words. See possible words 
prefixes, 20, 34-35, 69, 227, 339 
Priscianic formation, 172-174, 184, 339 
privative adjectives, 89, 339 
processing. See lexical access 
productivity, 67, 114, 181, 232, 339 
of alternations, 217—220, 231 
compared to creativity, 116-117, 132, 
325 
of compound patterns, 138, 140 
gradience of, 116-117 


of inflection classes, 163-165, 181, 
184, 185 
measuring, 118, 124, 129-131, 132 
restrictions on, 117-129 
speakers' knowledge of, 114-116 
unproductive patterns, 49, 67, 124, 
126, 132, 345; relationship to 
restrictedness, 130 
of valence-changing operations, 245 
profitability of a rule, 129, 325 
pronouns, 82, 91, 199, 206 
anaphoric, 194-195, 206 
See also clitics 
proprietive adjectives, 89, 258-260, 
339 
prosodic dependence of bound forms, 
196-197, 340 
prosodic words, 196, 203, 340 
pure stem, 138, 340 
purposive, 340 
quality nouns, 86, 87, 126, 340 
reanalysis, 55, 202, 221-222, 229 
reduplication, 38-39, 43, 48, 54, 340 
referentiality, 100, 141, 191-192, 194- 
195, 206, 340 
reflexivity, 239, 244, 247, 263, 340 
regularity of inflection, 75, 124, 132, 
274—276, 277 
relevance of inflection to the syntax, 
90-92, 101-106, 257—263 
relic alternations, 217—218, 220, 340 
repetitive verbs, 88, 340 
resultative verbs, 238—239, 244, 263, 
340 
retrieval from memory. See lexical 
access 
reversive verbs, 88, 341 
roots, 21—22, 23, 27, 34, 341 
rules of referral, 179-180, 341 
rule schemas, 168-171, 179, 341 
rules, 60, 75, 335 
domains of, 115 
in the morpheme-based model, 
41-46, 53, 54, 226 
in the word-based model, 46-53, 
54—55, 141, 157, 165, 335. See 
also paradigms, paradigm rules; 
rule schemas; morphological 
correspondences 
schemas. See rule schemas; word- 
schemas 
second-position clitics, 200-202, 341 
segmentability, 64—65, 189 
selectional restrictions, 34, 115, 341 
See also combinatory potential 
semantic roles, 85, 234—253, 341 
See also agents; beneficiary 
participants; experiencer 
participants; patients; source 
participants; stimulus 
participants; theme 
semantic scope, 145, 341 
semantic valence. See argument 
structure 
semantic-role structure. See argument 
structure 
separability (test for wordhood), 193- 
195, 341 
shortening 
alternation, 213-215, 225-226 
morphological pattern by itself, 
37-38, 54, 341 
sign languages, 2, 3, 12 
simple clitics, 200, 342 
single-component hypothesis, 106, 107, 
342 
sound change, 214—215, 220 
source participants, 234-236 
special clitics, 200—203, 207, 342 
speech style, 216, 217 
Split Morphology Hypothesis, 102-105, 
107, 109, 342 
stative verbs, 93, 238, 342 
status nouns, 87 
stems, 20—22, 23, 172-174, 342 
alternations, 36—38 
suppletion, 25-26, 27, 172, 276 
stimulus participants, 234—235, 342 
storage. See lexicon, lexical access 
stress, 223-227 
clitic groups, 198 
compound, 192, 195 
contrastive, 196-197 
stress shift, 37, 54, 342 
strict word-form lexicon, 61, 66—69, 70, 
75, 342 
structural cases, 100, 343 
structure preservation, 216, 343 
subcategorization frames. See 
combinatory potential 
subject arguments, 82, 83, 84, 234-235 
subtraction, 37, 54, 343 
suffixes, 20-21, 34, 343 
suppletion, 24-26, 27, 172, 218, 276, 
343 
affixal, 25, 26; basis for inflection 
classes, 158-165, 184 
and clitics, 199, 202 
surface representations, 23-25, 221, 
343 
syncretism, 184, 185, 343 
compared to accidental inflectional 
homonymy, 174-176 
formal description of 
rules of referral, 179-180, 341 
underspecification, 176-178, 185, 
345 
and frequency of use, 268—273, 276 
natural, 176, 177, 336 
synonymy blocking. See blocking 
syntactic functions, 82, 85, 234—244, 251, 
262, 343 
syntactic valence. See function structure 
syntax, 41-44, 67, 137, 145, 178, 
260—262 
agreement and government, 90-92, 
100, 163 
as a diachronic source of 
morphology, 52-53 
heads in, 147-149, 150, 153-155 
phrases compared to words, 190-206 
productivity of, 114, 123 
See also external syntax; internal 
syntax; syntax-morphology 
interface 
syntax-morphology interface, 8-9, 
61-62, 109-110 
comparison of morphology and 
syntax, 44-45, 147-149, 150 
feature-value compatibility, 177, 
328 
feature-value identity, 177, 328 
Lexical Integrity Hypothesis, 203- 
206, 206-207, 333 
morphosyntactic representations, 
103-105, 105-106, 335 
morphosyntactic features, 143, 206, 
335 
single-component hypothesis, 
105-106 
Split Morphology Hypothesis, 
102-105 
See also inflection 
synthetic compounds, 245, 249-253, 
256, 263, 344 
synthetic languages, 4-5, 344 
system-external explanation, 6-9, 11, 41, 
53, 106, 265 
targets, agreement, 344 
tense, 82-85, 92, 344 
inflectional values, 84; aorist, 320; 
future, 82, 329; past, 82, 83, 
84, 85, 338; perfect, 183, 338; 
pluperfect, 183; present, 81-85, 
181, 266, 339 
tests for wordhood, 191-202, 206 
thematic relations. See semantic roles 
theme, 234—236, 248-249, 344 
tonal change, 37, 38, 54, 344 
topicalization, 197 
transitive verbs, 248, 255—256, 266, 344 
transposition, 87-89, 90, 96-98, 253-262, 
344 
tree diagrams, 44—45, 142-143, 148, 344 
trisyllabic shortening, 213, 214-215, 
225-226 
underlying representations, 23-25, 221, 
344 
underspecification, 176-178, 185, 345 
uninflectedness, 92, 104, 149 
Universal Grammar, 9, 206, 345 
universal patterns, 7 
usual words. See actual words 
valence, 234—263, 345 
valence-changing operations, 234— 
245, 262 
velar softening, 226 
verbs, 82, 83, 84-87, 88 
auxiliaries, 321 
dependent verb forms, 86 
derivational types, 88; anticausative, 
88, 238—240, 244—245, 263, 320; 
antipassive, 240, 244, 263, 320; 
applicative, 88, 242-245, 263; 
causative, 88, 241—242, 321; 
deobjective, 240—241, 325; 
desiderative, 88, 326; factitive, 
88, 95, 286, 328; inchoative, 
88, 93, 331; repetitive, 88, 340; 
resultative, 238—239, 244, 263, 
340; reversive, 88, 341 
transitivity, 242, 246—251, 255-256, 
266, 332, 344 
statives, 93, 238, 342 
voice, 85, 161, 266, 345 
inflectional values, 85; active, 85, 219, 
269; passive, 85, 183, 237-239, 
244, 263, 265, 338 
vowel harmony, 193, 195, 198, 345 


vowel lengthening. See lengthening 
Wackernagel clitics. See second-position 
clitics 
weakening, 36, 38, 54, 229, 345 
word boundaries, 189-190, 193, 196, 
216-217, 231 
word families, 17, 18, 62, 346 
word formation, 18, 19, 346 
in the architecture of the grammar, 
103-105 
restrictions on, 117-122 
See also compounding; derivation 
Word and Paradigm morphology, 54 
Word Syntax, 54, 109-110 
word tokens, 15, 16, 27, 131, 189, 346 
word-based model, 41, 46—53, 54, 61, 
67-69, 70-74, 75, 128-129, 165-167, 
177-178, 345 
lexicon, 61, 66-69, 70-74, 75, 123, 
324 
paradigmatic approach to 
description, 156-158, 165-185 
rules, 46-53, 54-55, 141, 157, 165, 
335. See also paradigm rules; 
rule schemas; morphological 
correspondences 
word-class, 35, 43, 183, 345 
comparison of inflection and 
derivation, 90, 96-98, 101 
determined by compound head, 144 
determined by derivational affixes, 
35, 87, 146 
word-class-changing 
(transpositional) derivation, 
87-89, 253-257 
word-class-changing 
(transpositional) inflection, 
97-98, 257-262, 263 
word-forms, 15-19, 21, 27, 346 
words, 1-3, 15-19, 345 
distinguishing from phrases, 189-210 
word-schemas, 46-53, 54, 128, 166, 
346 
zero affix (zero expression), 45, 346 
lack of overt inflectional exponence, 
92, 158; and frequency of use, 
267—268, 272, 276 
problems for the morpheme-based 
model, 45-46, 54, 63, 64-66, 75, 
230 


